Tube Matching with Python, Part 2: Push, Pull, and Balance

For a hundred years, anyone who looked would find a combustion engine under the hood of a car.  Even though all combustion engines share a common goal (to convert a fuel’s potential energy to rotary motion), their designers all approached the problem from different angles, resulting in a variety of ideas, each with their own unique qualities.

The field of electronics evolved in a similar way.  Major challenges were approached from many sides, often resulting in a myriad of workable solutions, each with unique qualities that would hopefully be a perfect fit for the right application.


An early example of a push/pull audio amplifier, the 1924 RCA Radiola III

In the middle of the 20th century, audio amplification was an expensive problem to solve, and the cost was proportional to the amount of power (in watts) that was needed.  Manufacturers were in a race to offer more and more power to their customers, and as a result, circuit designs evolved to become more efficient.

The Push/Pull idea, a concept well-known to mechanical engineers, became widely adopted in audio amplifiers due to its increased efficiency.  The concept is simple: by doubling the number of power sources and setting them up to handle opposite phases of a load, power efficiency could be improved.

Two pistons in a push-pull system doing equal and opposite work toward the same goal

A Push/Pull amplifier is a symmetrical system similar to a two-cylinder engine, where two pistons take turns delivering power to the load.  Logically, two pistons should produce twice the power of one, but in a push-pull configuration even more power is achieved because energy is distributed to the load twice per cycle, and momentum is shared, minimizing losses.

A real-world example of this is the case of two lumberjacks sawing through a log from opposite ends of a saw.  It seems logical that they should get through the log in half the time it would take one lumberjack, but in reality they’ll finish even earlier due to efficiency gains.

An example of a Push/Pull system

The importance of Balance in push/pull systems

For a Push/Pull system to work smoothly, it is important that both power sources contribute equally to the load.  If one lumberjack works significantly harder than the other, this happens:

(The internet is so weird.  It took less than 30 seconds to find this perfect animation explaining imbalance in push-pull systems.)

Likewise, both cylinders in a two-cylinder engine must contribute equal power to prevent generating harmonically-related vibrations.  The same is also true for the Push/Pull audio amplifier — each tube must contribute equally for optimum results. 

“Matched” tubes

Since vacuum tubes of the same type tend to vary somewhat from one to the next, they need to be measured, sorted, and paired up with other tubes having identical characteristics.  Electric guitar players often buy “matched” tubes for their guitar amps for the same reason. 

Fortunately, the matching process for guitar amplifier tubes is straightforward:  measure a tube’s current at a fixed operating point, then compare it to other tubes measured the same way until matches are found.  In many cases, only one measurement per tube is required to achieve a decent match. 

Variable-Mu tubes

Variable-mu tubes used in audio compressors like the M97 require a more elaborate measurement scheme to obtain good matches.  Rather than comparing a single measurement per tube, a series of measurements across a range of operation must be taken on each tube and the results compared.

These tubes don’t match. If they did, their curves would perfectly overlap throughout their range.

Uh oh.

At some point I realized this would be a non-trivial problem for Electronaut; with over 1,200 variable-mu tubes in inventory and a dozen measurements per tube, I’d have to analyze 14,400 measurements to find the best matches.  Even if the measurement process could be automated, I still wasn’t quite sure how to manage the data and do the analysis.

The measurement-gathering part of the project turned out to be fairly simple, if labor-intensive.  Electronaut purchased and built one of Ronald Dekker’s excellent µTracer kits (pronounced ‘micro-tracer’) to perform the tube measurements.  The µTracer software exports a text file containing measurement data for each tube measured.  Once all the tubes are measured, there will be a folder with 1,200 text files in it, and from there it’s up to my Python chops to get meaningful data out of it.

I’ll also be making a jig to allow multiple tubes to warm up together, then be individually switched into and out of the measurement circuit.  That should save a lot of time performing the tube measurements.

Figure 1: A sample text file containing voltage and current measurements from a single vacuum tube.

The data source

Figure 1 shows a sample text file produced by the µTracer, containing the measurement data from a single tube, formatted as two space-delimited columns.  The first task of my Python program was to automate the importing and processing of all the text files into a master data structure of some kind that could be saved.

Planning out the works

I had originally envisioned some sort of cloud-based storage for the tube data, and a web-based interface to perform the analysis, but I also liked the idea of building a simple GUI and keeping the app only on the PC that performs the tube measurements in my lab.

All that changed when I discovered Jupyter notebooks.  Since the goal of my project is to make myself a tool I can use to solve a problem (rather than an app or website for commercial use), Jupyter notebooks allow me to focus solely on getting the results, rather than spending time making the app look slick.  I love me a slick app, but in this case I just want perfectly matched tubes!

Import and Analyze

It made sense to me to think of my project as two separate smaller projects with separate functions, so I decided to structure the project that way, as two separate Python modules:  the Importer would read and process a batch of tube data files and save the data somewhere, and the Analyzer would read the saved data and provide options for analysis and visualization.

Since I had been reading a bunch about Python, watching YouTube videos, and playing around with the Python interpreter in Terminal, I decided to go ahead and start working on the Importer module.

The Importer module

The convention in Python is to import all dependencies and declare any global variable(s) right off the bat, at the top of the code.  I’m using the os library (to allow me to interact with the operating system and see directories and files on my computer, etc.), and the tkinter library (to provide some GUI objects to be used when prompting the user for input).  In the case of tkinter, I’m using the filedialog and simpledialog objects.  Lastly, I’m using the pandas library to provide the tools I need to work with the tube data in a tabular data structure.

I’m also using a global variable to hold the tube type, which will be entered by the user when importing a batch of tube data files.  Declaring the global variable was a bit of a crutch I relied on when I couldn’t figure out how else to get it to work.  It bugs me now, and I’ll probably go back and figure out how to make it work without the global variable.  (Another thing to put on the to-do list.)

The next chunk of code is a function to prompt the user, using the tkinter filedialog object, to select a directory containing tube data files.

The filedialog object prompts the user to select the directory containing the µTracer files.  Once the user navigates to the desired directory and hits the ‘choose’ button, the ‘for’ loop iterates through every file found in that directory, checks if it has a .utd file extension, and if it does, adds the name of the file to a list.  When there are no more files to check, a list containing all the .utd files is returned.
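A minimal sketch of this step might look like the following.  The helper name list_utd_files is hypothetical, the sorting is my own addition (os.listdir makes no promise about order), and tkinter is imported inside choose_folder only so the filtering logic can run on a machine without a display.  As a convenience, this sketch also returns the chosen directory alongside the list of file names.

```python
import os


def list_utd_files(directory):
    """Return the names of all .utd files found in `directory`."""
    utd_files = []
    for file_name in os.listdir(directory):
        # Keep only the tube data files; ignore everything else.
        if file_name.endswith(".utd"):
            utd_files.append(file_name)
    # Sorted so the batch is processed in a predictable order (sketch assumption).
    return sorted(utd_files)


def choose_folder():
    """Ask the user for a directory; return (directory, list of .utd names)."""
    # Imported here (an assumption of this sketch) so list_utd_files works headless.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window behind the dialog
    directory = filedialog.askdirectory(
        title="Please select a directory containing uTracer files"
    )
    root.destroy()
    if not directory:  # the user cancelled the dialog
        return directory, []
    return directory, list_utd_files(directory)
```

Splitting the file filtering from the dialog keeps the part that can go wrong (which files get picked up) easy to check on its own.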

One thing I noticed is that Mac OS Sierra does not include the title bar in tkinter filedialog objects anymore.  Compare the dialog box from Mac OS 10.7.5, which includes the prompt, “Please select a directory containing uTracer files”.

The ‘filedialog’ object, when run on older Mac OS operating systems, includes the title bar string.
Mac OS Sierra appears to ignore the title bar string altogether.


The next function prompts the user to enter the type of tube for the batch of tube data files being processed, and calls tkinter’s simpledialog object to handle the interaction.

Using tkinter’s simpledialog object to prompt the user for the tube type
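As a rough sketch, the prompt could be written like this.  The prompt_func parameter is a hypothetical hook I’ve added so the function can be tried without a GUI, and the dialog title and prompt text are placeholders.  Returning the value, rather than storing it in a global, is also an assumption of this sketch.

```python
def ask_tube_type(prompt_func=None):
    """Return the tube type entered by the user, or None if nothing was entered."""
    if prompt_func is None:
        # Default path: a tkinter dialog (needs a display to run).
        import tkinter as tk
        from tkinter import simpledialog

        root = tk.Tk()
        root.withdraw()  # hide the empty main window behind the dialog
        answer = simpledialog.askstring(
            "Tube type", "Enter the tube type for this batch:", parent=root
        )
        root.destroy()
    else:
        # Hypothetical hook: any callable that returns a string or None.
        answer = prompt_func()
    # askstring returns None when the user cancels the dialog.
    return answer.strip() if answer else None
```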

Finally, the function that does the actual processing of the tube data files:

There are a few ways to open and read a file in Python, but as advised by my mentor, the ‘with/as’ statement is preferred because it guarantees that any file opened will always be closed when the code is finished with it.  With other techniques, an exception can prevent the interpreter from ever reaching the call to close the file, leaking the open file handle.  The with/as statement solves that by relying on two dunder methods that the file object provides, __enter__ and __exit__.

Here’s a good article explaining the behind-the-scenes activity when the ‘with/as’ statement is used.

“…when the ‘with’ statement is executed, Python evaluates the expression, calls the __enter__ method on the resulting value (which is called a “context guard”), and assigns whatever __enter__ returns to the variable given by as. Python will then execute the code body, and no matter what happens in that code, call the guard object’s __exit__ method.”
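To see that sequence in action, here is a toy context manager (the class name NoisyResource is made up for illustration) that records when each method runs.  Note that __exit__ still runs even though the body raises an exception.

```python
class NoisyResource:
    """Toy context manager that records when __enter__ and __exit__ run."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is what the name after `as` is bound to

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not swallow the exception


resource = NoisyResource()
try:
    with resource as r:
        r.events.append("body")
        raise ValueError("something went wrong in the body")
except ValueError:
    pass

print(resource.events)  # ['enter', 'body', 'exit']
```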

After the with/as statement has opened the file, the contents of the file are read into a string called file_contents_string.  This string contains the entire file, with all the spaces and line breaks, as one continuous series of characters.

The split method is then called on the string, splitting up the data into a list.  It’s possible to pass a delimiter as an argument to the split method, but if no argument is passed, it splits on any run of whitespace (spaces, tabs, and line breaks), which is what I needed.
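Here is a small, self-contained sketch of the read-and-split step.  The file layout in sample_text (two header lines followed by one voltage/current pair per line) is an assumption, and read_tube_file is a hypothetical name.

```python
import os
import tempfile

# Assumed layout: two header lines, then one "Vg Ia" pair per line.
sample_text = "Vg (V) Ia (mA)\nVa = 325 V\n-50 4.786\n-46 5.514\n-42 6.52\n"


def read_tube_file(path):
    """Read a tube data file and return its whitespace-separated tokens."""
    with open(path) as tube_file:
        file_contents_string = tube_file.read()
    # No argument: split on any run of whitespace, including line breaks.
    return file_contents_string.split()


# Write the sample to a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.utd")
    with open(path, "w") as sample_file:
        sample_file.write(sample_text)
    file_contents_list = read_tube_file(path)

print(file_contents_list)
```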

At this point, the file_contents_list variable is an ugly mess and looks something like this:

['Vg', '(V)', 'Ia', '(mA)', 'Va', '=', '325', 'V', '-50', '4.786', '-46', '5.514', '-42', '6.52', '-38', '7.82', '-34', '9.485', '-30', '11.678', '-26', '14.559', '-22', '18.289', '-18', '23.132', '-14', '29.463', '-10', '39.155', '-6', '58.719', '-2', '91.783']

My first thought was, “there’s stuff in there I don’t need.  I should get rid of it.”

Now I had a list that just contained the measurement data, without all the labels at the beginning of the list.
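For illustration, trimming the labels with del could look something like this (a sketch using a shortened version of the list above, not necessarily the exact line I wrote at the time):

```python
file_contents_list = ['Vg', '(V)', 'Ia', '(mA)', 'Va', '=', '325', 'V',
                      '-50', '4.786', '-46', '5.514']

# Delete the first eight items (the labels and the anode voltage), in place.
del file_contents_list[:8]

print(file_contents_list)  # ['-50', '4.786', '-46', '5.514']
```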

Later, when I met with my mentor for the first time, he called out my use of the del statement, saying it was a ‘code smell!’  My first ‘code smell!’  I guess there’s a first for everything.

Of course, he’s totally right.  Why write code to build a list, then write more code to immediately change it?  Ultimately, I’m going to be pulling values out of the list and giving them names, then I won’t need it anymore, so who cares if it has stuff I won’t use?

To extract the voltage measurements from the list, I use a list comprehension to build a new list called x_values.  The list comprehension iterates through the file_contents_list from index 8 (skipping the eight items at the front of the list) to the end of the list, in steps of 2, retrieving every even-indexed value.

At this point, the x_values list looks like this:

['-50', '-46', '-42', '-38', '-34', '-30', '-26', '-22', '-18', '-14', '-10', '-6', '-2']

The current measurements are extracted in a similar way, and are assigned to a new list called y_values.  This time the list comprehension iterates from index 9 to the end in steps of 2, retrieving every odd-indexed value.

['4.786', '5.514', '6.52', '7.82', '9.485', '11.678', '14.559', '18.289', '23.132', '29.463', '39.155', '58.719', '91.783']
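The two extractions can be sketched as follows, using the sample list shown earlier.  The exact form of the comprehensions is my reconstruction; slicing with [8::2] and [9::2] gives the same result.

```python
file_contents_list = [
    'Vg', '(V)', 'Ia', '(mA)', 'Va', '=', '325', 'V',
    '-50', '4.786', '-46', '5.514', '-42', '6.52', '-38', '7.82',
    '-34', '9.485', '-30', '11.678', '-26', '14.559', '-22', '18.289',
    '-18', '23.132', '-14', '29.463', '-10', '39.155', '-6', '58.719',
    '-2', '91.783',
]

# Grid voltages sit at indices 8, 10, 12, ...
x_values = [file_contents_list[i] for i in range(8, len(file_contents_list), 2)]

# Anode currents sit at indices 9, 11, 13, ...
y_values = [file_contents_list[i] for i in range(9, len(file_contents_list), 2)]

print(x_values)
print(y_values)
```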

Next, the final value of interest in the list, the anode voltage, is at index 6, so it’s assigned to a dictionary object named anode_voltage.

Another dictionary object named tube_ID is created, and its value is derived from the name of the tube data file, ignoring the last 4 characters (the file extension).  I then use a trick to combine the anode_voltage and tube_ID dictionaries into one.  Apparently, this has only recently been added to Python (since 3.5):

And finally, I use the zip function to create a new dictionary, using the x_values list as the keys and the y_values list as the values.  This new dictionary is combined with the tube_data_extras dictionary to create a temporary master dictionary object containing all the relevant data from that particular tube, called tube_data_dict.
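Put together, the dictionary-building steps might look like this.  The file name and the dictionary key strings are hypothetical, and the list is shortened to two measurements to keep the example small.

```python
file_name = "tube_0042.utd"  # hypothetical file name
file_contents_list = ['Vg', '(V)', 'Ia', '(mA)', 'Va', '=', '325', 'V',
                      '-50', '4.786', '-46', '5.514']
x_values = file_contents_list[8::2]
y_values = file_contents_list[9::2]

# Single-entry dictionaries for the extra values (key names are assumptions).
anode_voltage = {"anode_voltage": file_contents_list[6]}
tube_ID = {"tube_ID": file_name[:-4]}  # drop the 4-character ".utd" extension

# Dictionary unpacking merges them into one (Python 3.5 and later).
tube_data_extras = {**anode_voltage, **tube_ID}

# zip pairs each voltage with its current; dict turns the pairs into a mapping.
tube_data_dict = {**tube_data_extras, **dict(zip(x_values, y_values))}

print(tube_data_dict)
```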

The last function defined is the main function.

The batch_list variable holds the result of calling the choose_folder function, and the tube_type variable holds the result of calling the ask_tube_type function.  Then a loop begins, iterating through the list of files in the directory (batch_list), and processing each one using the import_tube_data_file function.  The loop produces a tube_data_dict object for each tube processed, which is appended to a master_tube_list.
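The loop can be sketched as below.  To keep the example self-contained, import_tube_data_file is a condensed stand-in for the function described above, and build_master_tube_list is a hypothetical name for the loop itself.  Passing the directory and tube type as arguments, and storing the tube type under a "tube_type" key, are assumptions of this sketch.

```python
import os


def import_tube_data_file(directory, file_name, tube_type):
    """Condensed stand-in: read one tube file and return its data as a dict."""
    with open(os.path.join(directory, file_name)) as tube_file:
        values = tube_file.read().split()
    extras = {
        "tube_type": tube_type,        # assumed key name
        "tube_ID": file_name[:-4],     # file name without ".utd"
        "anode_voltage": values[6],
    }
    return {**extras, **dict(zip(values[8::2], values[9::2]))}


def build_master_tube_list(directory, batch_list, tube_type):
    """Process every file in batch_list and collect the resulting dicts."""
    master_tube_list = []
    for file_name in batch_list:
        tube_data_dict = import_tube_data_file(directory, file_name, tube_type)
        master_tube_list.append(tube_data_dict)
    return master_tube_list
```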

Once the master_tube_list is built, it’s time to save the data somewhere.  I decided to save the data to the root level of the program itself, wherever it resides.  Again, this requires use of the os library.

The first line is basically saying, “Take the absolute path of myself” (the script being run).

The second line is telling the OS to change the current working directory to the directory contained in the script_path variable.
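Wrapped in a small function (the name change_to_script_directory is hypothetical), those two lines look roughly like this:

```python
import os


def change_to_script_directory(script_file):
    """Make the directory containing `script_file` the current working directory."""
    # Absolute path of the script, minus the file name itself.
    script_path = os.path.dirname(os.path.abspath(script_file))
    # Relative paths used from here on are resolved against that directory.
    os.chdir(script_path)
    return script_path
```

Inside the script itself, this would be called as change_to_script_directory(__file__).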

Finally, the Pandas library is used to export the data.

The first line builds a pandas dataframe called df, using the master_tube_list of dictionary objects as the data source.

The second line calls the .to_csv method on the dataframe, which saves the data as a CSV file called master_tube_list.csv into the current working directory.
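A sketch of the export step follows.  The function name save_master_tube_list and its csv_path parameter are hypothetical, and pandas is imported inside the function only so the example can be loaded on a machine where pandas isn’t installed; in the Importer itself the import sits at the top of the file.

```python
def save_master_tube_list(master_tube_list, csv_path="master_tube_list.csv"):
    """Build a dataframe from a list of tube dicts and save it as a CSV file."""
    import pandas as pd  # third-party; imported here only for this sketch

    # Each dictionary becomes one row; the dictionary keys become the columns.
    df = pd.DataFrame(master_tube_list)
    # A relative path is written to the current working directory.
    df.to_csv(csv_path)
    return df
```

Tubes that are missing a measurement at some voltage simply get an empty cell in that column.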

And finally, the boilerplate code that is required to call the main function when the script is executed:
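In its usual form, that boilerplate looks like this (the body of main here is only a placeholder):

```python
def main():
    # Placeholder body; the real main function runs the import steps above.
    print("Importer would run here")


# True only when the file is run as a script, not when it is imported.
if __name__ == "__main__":
    main()
```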



This description represents the Importer module as of late May 2017.  It went through a lot of versions before arriving at this point, and as I progressed I found ways to make things simpler.

It’s funny to me that an earlier working version consisted of many more lines of code, yet the current version accomplishes the identical task with far fewer.  I have little doubt that further refinements could be made, and as I mentioned earlier, I’ll probably make them at some point.  For now though, this tool is doing exactly what I need it to do:  take a gigantic pile of oddball files and organize them into a standard format that I can use anywhere.


The next part of the project is the Analyzer module, where I’ll finally get some answers!  I’m planning on using least-squares linear regression to compare each tube to the rest of the tubes in the dataset, then make an ordered list of matches for each tube.

I’m also planning on exploring some of the data visualization and graphing libraries available to Python.  My hope is to present the data in a clear enough way visually that it serves as a confirmation of the predicted results from the linear regression.


Next: Tube Matching with Python, Part 3 — Finding the Least Squares