Open Binary – Introducing a Practical Alternative to Open Source

I’ve been thinking about not only announcing releases and features, but also discussing (read: pointing and laughing at) some very common annoyances. I hope to one day be seen as a meaner alternative to The Daily WTF, but with fewer free mugs and more open source. Anyway, on to this post’s subject…

Open Binary – source code so obfuscated, “optimized” and arcane that despite an open source license nobody can edit or benefit from reading it. Your only hope is to compile it into a binary and hope it works. There are plenty of examples of this in the video world, such as mplayer, most popular Avisynth filters and, to be honest, almost every single piece of code written in the field of video processing.

So how do you produce an Open Binary? Well, in my opinion you have to put effort into multiple levels to succeed. For example, one important step is to OPTIMIZE! And by that I mean bitwise shifts! No compiler can ever figure out that a/2 can be compiled to a right shift, so you have to help it. It also makes the code faster. Another important detail to know about CPUs, even the most modern ones, is that they are slow readers. Armed with this knowledge, make all variable names short so there’s less for the poor CPU to read. For text parsing we can actually do one better: since modern CPUs are good at numbers, simply use the ASCII code instead of the letter in any text operation.
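In case the sarcasm needs a footnote, here is a minimal C sketch of why the hand-written shift is not even the same operation as the division it replaces. The function names half_div, half_shift and show_halves are made up for illustration and come from no actual filter. For signed integers, C99 division truncates toward zero, while right-shifting a negative value is implementation-defined and on typical two's complement targets rounds toward negative infinity. Mainstream optimizing compilers generally emit the shift (plus a small correction for negative values) for a/2 on their own.

```c
#include <stdio.h>

/* Readable version: C99 integer division truncates toward zero. */
static int half_div(int a)
{
    return a / 2;
}

/* "Optimized" version: for negative a the result is implementation-defined;
   on typical two's complement targets it rounds toward negative infinity. */
static int half_shift(int a)
{
    return a >> 1;
}

/* Prints both results side by side for one input value. */
static void show_halves(int a)
{
    printf("%3d: a/2 = %3d, a>>1 = %3d\n", a, half_div(a), half_shift(a));
}
```

Calling show_halves(7) and then show_halves(-7) should show the two versions agreeing for the positive value and, on most platforms, disagreeing for the negative one.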

An example of proper text parsing taken from TIVTC:

if (*linep != 0)
{
	qt = -1;
	d2vmarked = false;
	*linep++;
	q = *linep;
	if (q == 112) q = 0;
	else if (q == 99) q = 1;
	else if (q == 110) q = 2;
	else if (q == 98) q = 3;
	else if (q == 117) q = 4;
	else if (q == 108) q = 5;
	else if (q == 104) q = 6;
	else
	{
		fclose(f);
		f = NULL;
		env->ThrowError("TFM:  input file error (invalid match specifier)!");
	}
	*linep++;
	*linep++;
...continued for several hundred lines
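
For contrast, here is a hedged sketch of how the same lookup could be written for human readers. It assumes only what the excerpt shows: the specifier characters p, c, n, b, u, l and h (ASCII 112, 99, 110, 98, 117, 108 and 104) map to 0 through 6, and anything else is an error. The helper name match_index is hypothetical and is not part of TIVTC.

```c
#include <string.h>

/* Returns the match index 0..6 for a specifier character,
   or -1 if the character is not a valid match specifier. */
static int match_index(char spec)
{
    /* Position in this string is the value the excerpt assigns to q. */
    static const char specifiers[] = "pcnbulh";
    const char *hit;

    if (spec == '\0')
        return -1; /* strchr would otherwise match the terminator */

    hit = strchr(specifiers, spec);
    return hit ? (int)(hit - specifiers) : -1;
}
```

The caller would still close the file and throw the error on a -1 result. It could also advance with a plain linep++, since *linep++ dereferences the old pointer and then throws the value away.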

There are several other techniques you can use too. For example, you can write pure assembler or, even better, inline assembler, which effectively ties all your code to one platform and one compiler at the same time, in addition to being nearly impossible for anyone to modify or understand! You can also play the shell game with pointers and global variables: have one function add an offset to a pointer and pass it to the next, which subtracts it again. The secret is to put spaghetti in your sauce, so to say.

So who should use this approach to leverage the open source benefits? Big evil companies of course! Sure, you’ll have to reveal the source code but no one can ever use it for anything anyway. This is the end of part one of my “Business Strategies for the Modern Monopolist” series. I’ll be posting part two shortly.

Here’s a final example of successful use of assembler only to make a true Open Binary, again from TIVTC as I’ve spent far too much time staring at it recently. The actual post ends here so you don’t have to scroll down to look for more.

__asm
{
	mov y, 2
yloop:
	mov ecx, y0a
	mov edx, y1a
	cmp ecx, edx
	je xloop_pre
	mov eax, y
	cmp eax, ecx
	jl xloop_pre
	cmp eax, edx
	jle end_yloop
xloop_pre:
	mov esi, incl
	mov ebx, startx
	mov edi, mapp
	mov edx, mapn
	mov ecx, stopx
xloop:
	movzx eax, BYTE PTR [edi+ebx]
	shl eax, 3
	add al, BYTE PTR [edx+ebx]
	jnz b1
	add ebx, esi
	cmp ebx, ecx
	jl xloop
	jmp end_yloop
b1:
	mov edx, curf
	mov edi, curpf
	movzx ecx, BYTE PTR[edx+ebx]
	movzx esi, BYTE PTR[edi+ebx]
	shl ecx, 2
	mov edx, curnf
	add ecx, esi
	mov edi, prvpf
	movzx esi, BYTE PTR[edx+ebx]
	movzx edx, BYTE PTR[edi+ebx]
	add ecx, esi
	mov edi, prvnf
	movzx esi, BYTE PTR[edi+ebx]
	add edx, esi
	mov edi, edx
	add edx, edx
	sub edi, ecx
	add edx, edi
	jge b3
	neg edx
b3:
	cmp edx, 23
	jle p3
	test eax, 9
	jz p1
	add accumPc, edx
p1:
	cmp edx, 42
	jle p3
	test eax, 18
	jz p2
	add accumPm, edx
p2:
	test eax, 36
	jz p3
	add accumPml, edx
p3:
	mov edi, nxtpf
	mov esi, nxtnf
	movzx edx, BYTE PTR[edi+ebx]
	movzx edi, BYTE PTR[esi+ebx]
	add edx, edi
	mov esi, edx
	add edx, edx
	sub esi, ecx
	add edx, esi
	jge b2
	neg edx
b2:
	cmp edx, 23
	jle p6
	test eax, 9
	jz p4
	add accumNc, edx
p4:
	cmp edx, 42
	jle p6
	test eax, 18
	jz p5
	add accumNm, edx
p5:
	test eax, 36
	jz p6
	add accumNml, edx
p6:
	mov esi, incl
	mov ecx, stopx
	mov edi, mapp
	add ebx, esi
	mov edx, mapn
	cmp ebx, ecx
	jl xloop
end_yloop:
	mov esi, Height
	mov eax, prvf_pitch
	mov ebx, curf_pitch
	mov ecx, nxtf_pitch
	mov edi, map_pitch
	sub esi, 2
	add y, 2
	add mapp, edi
	add prvpf, eax
	add curpf, ebx
	add prvnf, eax
	add curf, ebx
	add nxtpf, ecx
	add curnf, ebx
	add nxtnf, ecx
	add mapn, edi
	cmp y, esi
	jl yloop
}
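
For anyone who scrolled down anyway: as best as can be read from the listing above, the body of xloop boils down to something like the following C sketch. It is a reading of the excerpt, not TIVTC's actual C path. The Accum struct and the function names accumulate and compare_pixel are hypothetical, the pointer names are taken from the asm, and the sketch assumes the map values are small (below 8) so that the 8-bit add in the asm never wraps. The surrounding y loop and its pitch bookkeeping are left out.

```c
#include <stdlib.h>

/* Hypothetical container for one direction's three accumulators
   (accumPc/accumPm/accumPml or accumNc/accumNm/accumNml in the asm). */
typedef struct {
    unsigned long c;
    unsigned long m;
    unsigned long ml;
} Accum;

/* Adds diff to the accumulators selected by mask, using the
   thresholds 23 and 42 and the bit masks 9, 18 and 36 from the asm. */
static void accumulate(Accum *acc, unsigned mask, int diff)
{
    if (diff <= 23)
        return;
    if (mask & 9)
        acc->c += (unsigned long)diff;
    if (diff <= 42)
        return;
    if (mask & 18)
        acc->m += (unsigned long)diff;
    if (mask & 36)
        acc->ml += (unsigned long)diff;
}

/* One iteration of the xloop body for column x.
   Parameter names match the pointers used in the asm. */
static void compare_pixel(const unsigned char *mapp, const unsigned char *mapn,
                          const unsigned char *prvpf, const unsigned char *prvnf,
                          const unsigned char *curpf, const unsigned char *curf,
                          const unsigned char *curnf,
                          const unsigned char *nxtpf, const unsigned char *nxtnf,
                          int x, Accum *prev, Accum *next)
{
    /* Assumption: map values are below 8, so no 8-bit wraparound occurs. */
    unsigned mask = ((unsigned)mapp[x] << 3) + mapn[x];
    int cur, diff;

    if (mask == 0)
        return; /* the asm skips straight to the next column */

    /* Weighted sum of the current frame's three samples. */
    cur = 4 * curf[x] + curpf[x] + curnf[x];

    /* Absolute difference against the previous frame's two samples. */
    diff = abs(3 * (prvpf[x] + prvnf[x]) - cur);
    accumulate(prev, mask, diff);

    /* Same comparison against the next frame's two samples. */
    diff = abs(3 * (nxtpf[x] + nxtnf[x]) - cur);
    accumulate(next, mask, diff);
}
```

Whether a modern compiler would vectorize this better than the hand-written version is a separate question, but at least it can be read and compiled on something other than 32-bit MSVC.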

R13 – Conditional Filtering and Memory Optimizations

It’s time for another release since it’s been over a week. The new things are a redone system for accessing frame properties (this time it’s less awkward and arcane), the possibility to write a full filter in Python only (if you’re clever enough to figure out how to abuse ModifyFrame), and memory management, which is now enabled. This means that VapourSynth will aggressively try to keep the amount of used framebuffer memory below 1GB to avoid running out of address space.

I also added all useful internal Avisynth filters turned into a standalone plugin to the downloads. It should make the transition to VapourSynth easier while waiting for your favorite internal filter to be ported. If your favorite filter happens to be a simple one I suggest you give porting it a try yourself.

As I feel the core is almost complete now, I will focus on creating more automated regression tests, documenting everything and tweaking the automatic cache size adjustment for the next release. Phase one of the project is nearing its end. After that I will focus on porting popular filters properly, as in making them work on Windows, Linux and OS X in both 32- and 64-bit mode. As some of you may have noticed, I’ve already ported EEDI3 and I’m currently working on TIVTC, a difficult project (but not for the right reasons) which I’ll write another post about.

R12 – VapourSynth Takes a Step in The Enterprise Direction

This new version has something for everyone. It has a round of bug fixes, one of them to the threading, which means that it should be able to completely max out a 4 core CPU when running mdegrain2. The other feature (requested through donations) is support for v210 output, the most used 10 bit format in professional video editing. To enable this output in VSFS and VFW, add this to your script:

last = yourvideo
enable_v210=True

If you do not add it, the output will default to P210. The documentation will be updated later today with more detailed installation instructions for VSFS. For those of you who can’t wait, the install method is very similar to AVFS (just look for vsfs.dll).

R11 – VFW returns and Python 3.3

VFW has been debugged. Greatly. I’ve also added high bitdepth output support. However, v210 shall be left out (I hate packed formats) unless someone contributes a patch or requests it in a donation message. VFW also has some behavioral changes, such as returning a clip with colorful bars on error. This version also requires Python 3.3, as this cuts down on the number of copies of Visual Studio I have to keep around.

Note that it is possible and quite easy to recompile the Python and VFW modules for other versions. Maybe someone will contribute a Python 2.7 compile one day.

R10 – VFW enters the scene from the left

Another few days have passed and the VFW module is now ready for a first test out in reality. It supports all the same output formats as Avisynth 2.6 plus one more (which you will probably never use)! I have decided that the official extension for these virtual avi scripts should be .vpy. Think Video PYthon script to remember it. It was the best I could come up with that wasn’t used by something else as well. The VFW part has been tested with AVISource in Avisynth, MPC-HC and VirtualDub. Out of these, VirtualDub is the only one that needs a small workaround: since it detects the special compatibility mode based on file extension, you MUST SELECT the compat option in the File\Open dialog or it won’t open.

Of course there’s also a bunch of other changes, such as an installer that comes with all the needed runtimes, the usual pile of Python module fixes and other things. Release early, release often, as the saying goes. My schedule is to try to make a new release every time I cross off something on the todo list.

I have also added a donation button. At least consider using it. Click on it once and then quickly close the window if you don’t want to donate anything. I will however never threaten to stop the project or anything else that’s close to drama if donations are low.

R9 – the mostly done release

This is the big release. The new things are:

  • Full documentation of all the included functions
  • The full source released as LGPL
  • Fixes bugs in most of the internal filters
  • Adds Python + operator and slicing support to clips
  • Fully working on Linux (and probably OS X too if you want to try compiling there)
  • Many other small things like more fixes to y4m output

With this release I consider the API stable and the core mostly feature complete. So start writing and porting plugins!

What’s next on the todo list (in no particular order):

  • Write a masktools replacement (probably with asmjit)
  • Write a general subtitling plugin, something like AssRender for Avisynth
  • Figure out what to do about the horrible resizers
  • Write a vfw module
  • Make it possible to write full plugins and not just functions in Python
  • Document the C API and make a small example of how to write an Invert filter
  • Other stuff

 

R8 – The sad interim release with some bugfixes

The title says it all. I had hoped to have a bit more ready, but this is at least a bit over halfway to my stated goals.

The big news is the addition of a function type. This allows some interesting things like this:


def selector1(sdict):
    # alternate between the clips based on the requested frame number
    a = sdict['N']
    a = a % 2
    b = {'val': a}
    # return the index of the clip to select
    return b

def selector2(sdict):
    # for these functions the key 'N' holds the requested frame number
    # and FX corresponds to the index of the frames
    frame_properties = get_props(sdict['F0'])
    if frame_properties['SomeFrameProperty'] < 0.5:
        return {'val': 0}
    else:
        return {'val': 1}

# some source for clip0 and clip1 here
ret = core.std.SelectClip(clips=[clip0, clip1], src=[clip0], selector=selector1)

# ret will now return every other frame from clip0 and clip1 interleaved

# selector2 shows how frame properties can be factored into it

R7 – the source is slowly coming

After almost 2 weeks of rapid improvement R7 is done. It mostly introduces more improvements for the Python module, which now accepts unnamed arguments and has improved y4m output. The most important detail is the addition of a B value to the y4m header so it can signal higher bitdepths automatically too. Since it’s a minor change I hope x264, libav and other tools will adopt this property soon.

Another new thing that may interest some of you is that this build also comes with the header for developing your own plugins, plus the full source for all the functions in the std namespace (MIT licensed). I still have a few more things I want to add and revise before releasing the rest of the source to the public.

So instead of giving a vague “soon” as a release schedule for the source, I will simply give you a list of things I want to finish before it goes out the door:

  • Decide which license to release it under. (discuss it if you want but I’m tilting towards something like free for non-commercial use, source modifications must be shared)
  • The addition of a function type. This is mostly for internal use as Python handles callable objects well enough.
  • The ability to write whole filters in Python. It would make quick prototyping easy and open up filter writing for more people (unfortunately Python itself has one huge mutex called the GIL, so if it’s used too much it could lead to serious slowdowns).
  • More checks. One of the most important things in my opinion is to detect errors early and report them, especially for developers. The core already has a lot of checks in its handling of external plugins so new plugin writers will become aware of their mistakes.
  • Finish the standard functions. The main missing functions are Transpose, CropRel and several small ones for property manipulation and per frame selection.

Maybe I’ll reveal the long term goals next time.