Technium Adeptus: 2014

Thursday, November 20, 2014

Misguided and Hazardous Language Features

Every programming language has features that are problematic. Some make you shake your head in embarrassment. Some make you cringe. Some make you wish you had never learned about them. In some cases, they are archaic artifacts of assembly language or good ideas that are now obsolete. Other times, they seemed like good ideas, but ultimately cause more problems than they are worth. Books and courses on programming often mention these "features" in passing, with a brief section on how they work and a warning to never use them. Some of these are so bad, however, that it is tempting to leave them out entirely. While it makes sense to avoid teaching techniques that should never be used, it is also dangerous.

Perhaps the best known language feature that should never be used is the goto. Gotos are an artifact of assembly language and other unstructured programming languages. They are still highly relevant, because machine languages do not know about things like loops. Structured flow control is implemented using gotos at the machine level. Perhaps this does not matter to the typical programmer, but it does matter to compiler writers and anyone else who needs to do any degree of assembly level optimization (writing device drivers also frequently requires assembly programming). In the highest end programming jobs, it is important to understand gotos. Most books and college courses on C or C++ do not even mention gotos, and this is something of a shame. Their reasoning is fairly sound: Gotos should never be used in a structured programming language. If students are never taught that gotos exist, they will never use them. There are two flaws in this though. First, what happens when a college graduate is asked to maintain some old code with gotos in it? It may be true that they should not be there, but that does not change the fact that not knowing a very basic language feature will not look good to peers and supervisors. The second flaw is that they are probably going to find out about gotos anyway, and without any education on them, they may not realize that gotos are bad. It is actually fairly common for programmers to try out newly discovered language features wherever possible, either to show off or to practice using them. Someone who does this is going to get a worse reputation than someone who does not know what a goto is in the first place.
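To make the link between loops and gotos concrete, here is a minimal C sketch. The function names sum_with_loop and sum_with_goto are made up for this illustration. Both functions compute the same sum; the second spells out, by hand, roughly the jumps a compiler would generate for the first. It is meant as an aid for reading old code, not as something to imitate.

```c
/* Sum the integers 1..n with an ordinary structured loop. */
int sum_with_loop(int n) {
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* The same loop written with explicit jumps: test the condition,
 * jump out if it fails, run the body, jump back to the test. */
int sum_with_goto(int n) {
    int total = 0;
    int i = 1;
loop_top:
    if (i > n)
        goto loop_end;
    total += i;
    i++;
    goto loop_top;
loop_end:
    return total;
}
```

Calling either function with 10 returns 55. The structured version states its intent in one line (the while condition), while the goto version forces the reader to trace labels to discover that it is a loop at all.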

Another ill-conceived language feature, one that probably seemed like a good idea at the time, is JavaScript's "with" statement (it is nothing like a Python with). The "with" statement adds the properties of an object to the scope of a block, so they can be used as if they were local variables. In cases where the reference tree is fairly deep, this looks like it should improve performance just by reducing the amount of text that the interpreter needs to parse, although in practice the extra scope makes name lookups harder for the engine to optimize. Sadly, it also clutters the namespace in the scope of the "with" block in ways that are often unintuitive and subject to change. Adding a property to the object, including properties added to built-in objects by newer versions of the language, can silently change which variable a name inside the "with" block refers to, making behavior version dependent. The "with" statement is so prone to error and confusion that it is forbidden in strict mode. Few JavaScript tutorials and books even mention it. The problem here is that "with" is far more nefarious than goto. A bit of critical thinking can quickly reveal that gotos are unnecessary in a structured programming language. A bit of critical thinking about "with" makes it seem like a totally awesome idea. It reduces typing, and it appears to improve performance. The ambiguity it can cause is not obvious at all, and the issues with name collisions are also unlikely to be noticed until one hits. Ignoring the "with" statement is a horrible idea, because if students are not warned, it will eventually get found and used.

Ignoring bad language features is just asking for trouble. Besides the fact that many programmers will eventually come across them anyway, and they will not look very smart if they are not at least aware of them, if they are not warned, they are more likely to use them when they finally do discover them. If we do not want programmers using gotos, letting them discover them on their own is not the right way to do it. Newly discovered language features are always tempting to try (this is, after all, how we learn to use them effectively). If students are not warned, they will eventually discover the feature anyway, potentially with severe consequences. (There are some programming companies that will fire a programmer who uses a goto on the spot, with no questions asked.) An education in a programming language is not complete without warnings about what not to do. A course or book that claims to teach a programming language is not complete without warnings about the bad features of that language.

Wednesday, October 22, 2014

C Programming: Encapsulation

Encapsulation is perhaps the most valuable aspect of object oriented programming. Without encapsulation, object orientation would not have been useful enough to become popular, in large part because most of the other useful aspects of object orientation rely heavily on encapsulation. Encapsulation is the intuitive grouping of related data into coherent blocks. In C, the most common form of encapsulation is grouping related functions into libraries. This only brushes the surface of the possibilities though, because a library is only a single instance of an encapsulated entity, and additional instances cannot be made. In object oriented programming, encapsulation is typically used to group related variables together with the functions used to manipulate them. Encapsulation's strength is that it can be used to logically order data and operations in a way that is easy to understand and remember.

Encapsulation is not inherently beneficial in programming. It does not improve program efficiency, and it often harms it. Encapsulation can improve development speed and make program source code easier to understand. This can substantially reduce the time required to create a program, and it can also make maintenance far easier. Object oriented programming languages and styles are primarily popular because encapsulation increases development speed, thus increasing potential profits. The benefits of encapsulation are primarily business benefits, not benefits to the program itself.

Encapsulation comes at a heavy cost. While it may be worth the cost to gain the benefits, it is still important to be aware of the costs. Encapsulation almost always results in both memory and performance costs. In many object oriented languages, objects carry function pointers (or a pointer to a table of them) in addition to data. These pointers take up memory, and large numbers of objects or objects with many functions can take up substantial amounts of memory. Contrasted with purely procedural languages, where each function exists in the code only once and is called directly, object orientation can be very inefficient in memory usage. To add to this, many object oriented languages store additional information about objects, such as object type, even in the compiled program, which uses even more memory. This, however, is not the worst part.
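The per-object memory cost can be measured directly in C. The following sketch compares a data-only struct with one that also carries two function pointers, the way a hand-rolled "object" might. The struct and function names here are hypothetical and exist only for this comparison.

```c
#include <stddef.h>

/* Hypothetical data-only record. */
struct plain_point {
    int x;
    int y;
};

/* The same data plus two per-instance function pointers. */
struct object_point {
    int x;
    int y;
    void (*move)(struct object_point *self, int dx, int dy);
    int  (*length_squared)(const struct object_point *self);
};

/* Bytes each instance spends on something other than its two ints. */
size_t per_instance_overhead(void) {
    return sizeof(struct object_point) - sizeof(struct plain_point);
}
```

The exact figure depends on the platform. On a system with 8 byte pointers, one would expect the two pointers to triple the size of each instance, and that cost is paid again for every element of an array.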

Encapsulation changes how data is stored in memory. On older computers and embedded systems, this effect may be negligible, but on modern systems with advanced memory caching, it is very important. Processor caches are used to avoid slow memory access by storing frequently used data on very fast memory on the processor. This memory is called the cache. Modern processors use additional techniques to further improve cache performance. One of these is to load memory into the cache in chunks. These chunks will always contain the memory that is being accessed, but they also contain nearby sections of memory that are likely to be used by the program in the near future. Since modern processors are many times faster than memory, caching is important to get the full performance of the processor. However, how data is used in a program dramatically affects how efficiently the processor cache is used. A program that jumps around memory, accessing data from widely separated areas, will force the cache to load new data from memory (discarding the old data) very frequently. When data is not found in the cache and must be loaded from memory, this is called a cache miss. Cache misses take a long time, during which the processor will either be idle or executing a different program, reducing the performance of the program waiting for the data to be loaded. When a program loops through a contiguous array of data, cache misses are very infrequent, and performance is maximized. When a program jumps around, performance is dramatically reduced. This is important because organization of data in object oriented programs is different from how data is typically organized in procedural programs. Each object stores its data in a single location. An array of objects, even in contiguous memory (most OOP languages store objects on the heap, which is not guaranteed to be contiguous), that is looped through to access only one or two member variables will still load the entire objects into the cache. 
This causes a lot of unused data to cycle through the cache, displacing data which would have been used. The result is a dramatic increase in cache misses, which substantially reduces performance. A similar procedural program might store the "object" data in multiple arrays, only accessing (and thus caching) the data in the arrays that are actually used.

This adds another benefit. Processor caches are divided into sections, where each section of the cache holds some section of memory. These sections are rotated through (typically the least recently accessed is overwritten by new data being loaded). Looping through two arrays at once can take advantage of this when one cache section holds part of one array and another holds part of another array. Using encapsulation, an array of objects will likely use only one cache section at a time, constantly missing and loading more data.

Now, we have only looked at how encapsulation affects cache usage when an array of objects is contiguous in memory. This is not typical. Most object oriented languages dynamically allocate memory for objects, and dynamically allocated memory is often not contiguous. Looping through non-contiguous memory will cause cache misses at almost every iteration, causing severe performance loss.

The costs of encapsulation are sometimes justified, but it is important to be aware of them when making design decisions. When using object oriented programming, it is important to understand that OOP is a human invention designed to make interfacing with computers on a low level easier. Computers do not "think" in objects, so there will always be costs to the translation between human ideas of objects and how computers actually work. Understanding the underlying architecture can help mitigate the costs of encapsulation, but it cannot eliminate them entirely.

Now we can discuss how to use encapsulation in C. C already has primitive support for encapsulation, which we can leverage to implement full encapsulation. Note that this is going to be ugly and unwieldy, and it should typically be avoided wherever possible. As an exercise in understanding the C language, however, this may be very useful.

C has a meta data type called a struct. Structs are used to create composite data types that essentially encapsulate data. Structs can contain only data. They cannot contain functions or code. Following is a simple example of a struct.

struct item {
    int id;
    int price;
    int count;
    char *name;
};

This code defines a struct of type "item". An instance of this struct could be created and manipulated with the following.

struct item can;
can.id = 1;
can.price = 120;
can.count = 12;

The name element would be assigned a pointer to a C string. This struct might be a data type for an inventory system, where id is the inventory id number, price is the price in cents (since it is an int), count is the number of items in stock, and name is a pointer to a character string containing the name of the item. This struct could be passed to and returned from functions like any other variable. We could make an array of this new type to hold all of the different items the store carries. Unlike most object oriented languages, a statically created array of these structs would be contiguous in memory (though the character arrays pointed to might not be contiguous, and would certainly not be contiguous within the array). This helps keep our data coherent and understandable by human standards. Without the struct, we might instead create a price array, a count array, and a name array, and the ids could be the array indices for each item (we might even avoid storing ids this way, reducing memory costs). If every loop over the struct array used every element of each struct, separate arrays would give us no performance benefit. Since that is unlikely, we will be paying for coherency in performance (in an inventory system, where performance is not that important, this is probably justified).
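The two layouts can be put side by side. This is a sketch only: ITEM_COUNT, struct inventory_arrays, and the two total_stock functions are names invented for the comparison, and name is declared const here so string literals can be stored without warnings.

```c
#include <stddef.h>

#define ITEM_COUNT 4  /* hypothetical inventory size */

/* Layout 1: one struct per item, stored in an array. */
struct item {
    int id;
    int price;         /* cents */
    int count;
    const char *name;
};

/* Layout 2: one array per field; the array index doubles as the id. */
struct inventory_arrays {
    int price[ITEM_COUNT];
    int count[ITEM_COUNT];
    const char *name[ITEM_COUNT];
};

/* Only count is needed, but each whole struct passes through the cache. */
int total_stock_structs(const struct item *items, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += items[i].count;
    return total;
}

/* Only the count array is touched; price and name are never loaded. */
int total_stock_arrays(const struct inventory_arrays *inv, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += inv->count[i];
    return total;
}
```

Both functions return the same total for the same data. The difference is only in how much memory each loop drags through the cache to get there, which matters more as the struct grows and the array gets longer.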

Now that we have a struct for our inventory items, we might want to reuse this idea in our POS system. We might want a struct for transactions. A transaction might contain a pointer to an array of items (we cannot make the array static, since it may have a different size for each transaction). We could use the previous struct for items, but we would use the count element as the quantity purchased instead of inventory. When finalizing a transaction, we might need to calculate sales tax, so a transaction will need a sales tax as well as a function for calculating tax and setting the variable in the struct. Normally, we would just write a function that takes the struct and modifies it. Maybe we want to be able to use different algorithms for sales tax depending on the customer (business customers might have a lower tax rate, while out of state and government customers might be exempt), but we want to be able to treat all transactions the same. This gets into very basic polymorphism, which we will discuss in depth in a different article. For now, however, we need our transaction to use a different tax algorithm for different customers, and we do not want to have to keep track of tons of information to accomplish this. We want to be able to populate most of a transaction, then send it to a function to finalize it that can use the same procedure on all transactions.

While structs cannot contain functions, they can contain function pointers. So long as the functions pointed to have the same signatures (return values and argument lists), they can easily be called in another function that knows how to use them. We could create several functions for calculating tax, and then we could store a pointer to the appropriate function in the struct. Later, when we finalize the transaction, we can just call the function pointed to by the transaction, and it will calculate tax for us. We do not have to care which algorithm is used during finalization. Here is a very simple example of this.

struct transaction {
    int pretax;
    int tax;
    void (*gettax)(struct transaction*);
};

There is the struct. The pretax element will contain the sum of the prices of items being purchased (in real life, we would have an array of items, and the finalizer would calculate the pretax total from those). The tax element begins empty, and it will be populated by the function pointed to by the gettax element. The gettax element can point to any function that returns void and takes a single argument that is a pointer to a transaction (a pointer because we need to change the original). Now we need some tax functions.

void normalTax(struct transaction *t) {
    t->tax = t->pretax * 5 / 100; // 5%, integer math keeps cents exact
}

void exemptTax(struct transaction *t) {
    t->tax = 0;
}

These two functions return void and take transaction pointers, just like the function pointer in the struct definition. The first calculates a 5% sales tax, while the other sets tax to 0 for tax exempt customers. Now, we want to create a transaction. This will be a normal customer, with normal sales tax, who is making a purchase that totals to $10.00 (since we are using ints to store cents, it will be 1000).

struct transaction t;
t.pretax = 1000; // $10.00
t.tax = 0; // Initialize to 0
t.gettax = normalTax; // Normal customer

There is our transaction. If we were serving a tax exempt customer, we could set gettax to exemptTax instead (a function name used as a value already acts as a pointer to the function, so no operator is needed). Now we are ready to calculate tax. This can be done with the following line, regardless of what tax algorithm we are using.

(*t.gettax)(&t);

That will call whichever tax algorithm we selected earlier, passing the transaction in by pointer, so we can set the tax element appropriately. After running the above, we will find that t.tax is equal to 50, which means $0.50. If we had used the tax exempt algorithm, the tax would have been 0.

This example is obviously contrived, and in real life we probably would have explicitly called a different function for each tax mode. In fact, this would probably be a better way to do it for this situation, but this sort of encapsulation has its strong points in many other applications.
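One place where the function pointer starts to earn its keep is the finalizer described earlier, which should run the same procedure on every transaction. The sketch below is one way that might look. The items and item_count members, the cut-down struct item, and the finalize function are assumptions added for this illustration, and the 5% rate is computed with integer arithmetic.

```c
#include <stddef.h>

struct item {
    int price;  /* cents, per unit */
    int count;  /* quantity purchased */
};

struct transaction {
    const struct item *items;  /* hypothetical: items being purchased */
    size_t item_count;
    int pretax;
    int tax;
    void (*gettax)(struct transaction *);
};

void normalTax(struct transaction *t) { t->tax = t->pretax * 5 / 100; }
void exemptTax(struct transaction *t) { t->tax = 0; }

/* Same steps for every customer type: total the items, then let the
 * transaction's own tax function fill in tax. Returns the grand total
 * in cents. */
int finalize(struct transaction *t) {
    t->pretax = 0;
    for (size_t i = 0; i < t->item_count; i++)
        t->pretax += t->items[i].price * t->items[i].count;
    t->gettax(t);
    return t->pretax + t->tax;
}
```

The finalizer never checks what kind of customer it is dealing with. Adding a new tax rule means writing one more function with the same signature and storing its address in gettax; finalize itself does not change.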

Again, this is an ugly and unwieldy way of implementing encapsulation. If you really need encapsulation, it would probably be better to use an object oriented language like C++. In the rare situation where that is not an option or where you only need very basic encapsulation and only to a very limited degree, this might be appropriate. If you are considering doing this, first ask yourself why you are using C in the first place. It is very likely that the reason you are using C is because object oriented programming is much more expensive in memory and performance. If this is the case, you should probably find a more efficient way of solving your problem.

Here is the source code for a simple C program that implements and uses the transaction example from above:

#include <stdio.h>

struct transaction {
    int pretax;
    int tax;
    void (*gettax)(struct transaction*);
};

void normalTax(struct transaction *t);
void exemptTax(struct transaction *t);

int main(void) {
    struct transaction t;
    t.pretax = 1000;        // $10.00
    t.tax    = 0;
    t.gettax = normalTax;  // exemptTax for no tax

    (*t.gettax)(&t);

    printf("Price: %i\n", t.pretax);
    printf("Tax:   %i\n", t.tax);
}

// For normal 5% sales tax
void normalTax(struct transaction *t) {
    t->tax = t->pretax * 5 / 100;
}

// For tax exempt customers
void exemptTax(struct transaction *t) {
    t->tax = 0;
}

Friday, September 26, 2014

C Programming: Singleton Design Pattern

The Singleton design pattern is a common pattern used in object oriented programming. To use the pattern, any constructors of the singleton object must be private. The user must not be able to create new instances of the object explicitly. In this design pattern, the class must only be able to have a single instance. Often this instance is created the first time it is requested, but it may be created at startup time, depending on the programming language. Subsequent requests will be provided with the already existing instance. This design pattern is primarily useful in languages that require object orientation, as a place to collect related global variables and functions, where only one instance of the collection should ever exist. It is less often used in languages that allow object orientation but do not enforce it. There are some cases, however, where it is useful regardless of the language.

One place where the Singleton design pattern is useful regardless of language is the case where a single instance of a global variable is necessary, but it is also necessary to limit how the user may interact with that variable. In 3D graphics, the main camera is one of these global variables. The camera can be stored as a pair of vectors, one representing "up" and the other representing the direction the camera is facing. The third vector, the facing of one of the sides, can easily be calculated from the other two. It is essential, however, that the "up" and "facing" vectors always be perpendicular to each other. If they ever become parallel, the third vector cannot be calculated, and the camera math starts to get zeros and infinities where they do not belong. This makes it impossible for the computer to render graphics that make sense. The Singleton pattern can be used to solve this problem. A single instance of a camera class can be made where the main camera is a private variable of the class. The setter for the camera can ensure that changes to the camera never allow invalid states. Further, methods can be added to the Singleton that allow the user to apply specific transformations to the camera, which removes the burden (and risk) of users trying to do the math for the transforms themselves.
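The vector math behind this is short enough to show. The sketch below uses a hypothetical struct vec3 with two helper functions: the cross product yields the third (side) vector, and the dot product is the usual test for perpendicularity.

```c
struct vec3 {
    double x, y, z;
};

/* Cross product a x b: a vector perpendicular to both inputs.
 * If a and b are parallel, every component comes out as zero. */
struct vec3 cross(struct vec3 a, struct vec3 b) {
    struct vec3 r;
    r.x = a.y * b.z - a.z * b.y;
    r.y = a.z * b.x - a.x * b.z;
    r.z = a.x * b.y - a.y * b.x;
    return r;
}

/* Dot product: zero when a and b are perpendicular. */
double dot(struct vec3 a, struct vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

With up = (0, 1, 0) and facing = (0, 0, -1), the dot product is 0 and the cross product of up and facing is (-1, 0, 0), a usable side vector. If facing is set equal to up, the cross product collapses to (0, 0, 0), which is the source of the zeros and infinities mentioned above.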

In most cases, the Singleton design pattern is used to hold global things where the language does not provide a better option. In some cases though, this design pattern can be useful in its own right. A problem occurs when the benefits of this design pattern are necessary in a language that does not support object orientation. For example, the C programming language has no object orientation support, but embedded systems often have limited support for languages other than C (or assembly). This may not be true of all non-object oriented languages, but the Singleton design pattern is actually possible in C.

This C Programming series is going to discuss how to use object oriented principles in the C language. In most cases, it is probably a bad idea to use these principles if any other option is available, but in cases like embedded systems, where an object oriented language is not available, it may be necessary, or at least substantially more efficient, to use these principles. The remainder of this article will discuss using the Singleton pattern in C and demonstrate how it can be done.

In C, encapsulation and hiding sensitive data are generally considered impossible. Very basic encapsulation can be accomplished with structs, but the language does not have any built in mechanics for preventing a user from changing any variable that is in scope. This means that protecting a global variable in a getter/setter fashion is impossible. This leads to several difficulties. The first is that it is impossible to enforce data validation. A well designed library might offer setters and getters, but a user of the library might choose to go around them, accessing the variable directly. This puts the burden of correctness on the user, which has proven problematic enough to justify the wide adoption of private and protected variables in object oriented languages.

There is a simple way of making private global variables in external libraries. This is probably nothing new, and has likely been used in many C libraries that use internal state machines. It is not, however, often taught in computer science classes. In C, libraries are contained in separate files from the main program. Each library has at least one source code file as well as a header file. The header file exposes interfaces contained in the library to the program that is using the library. Global variables are exposed with an "extern" statement. If they are not exported, the main program does not even know they exist and thus cannot access them. This does not mean that they do not exist though. The library where the variables are declared can still access them. If this library has exposed functions that can change the hidden variables, then the main program can still access them indirectly. This technique can be used for functions as well. Following is some example code for a C library using what amounts to the Singleton design pattern.

private.c:

// This variable is subject to strict
// requirements.
int private_variable = 5;

// private_variable must be between 5 and 10
// inclusive. Invalid input will be ignored.
void set_private(int input) {
    if (input < 5 || input > 10)
        return;
    else
        private_variable = input;
}

// We don't want to expose the variable or its
// memory address, so we use a getter to return
// by value.
int get_private(void) {
    return private_variable;
}

private.h:

// The hidden variable must be between 5 and 10
// inclusive. Invalid input will be ignored.
void set_private(int input);
int get_private(void);

The source code is pretty straightforward. It has a single global variable, a setter, and a getter. For some reason, it is necessary to restrict what the variable is allowed to be, so the setter handles that by ignoring invalid input. The header file is pretty straightforward as well. It exposes the two functions but not the variable. To expose the variable, "extern int private_variable;" could be added to the header file. Notice also that the comment in the header file does not name the variable. If this were distributed as a header and a precompiled object file, the user would not be able to figure out the name of the variable without searching through the object file for intelligible text and then guessing. If the header reveals the name of the variable though, an injudicious user might add an "extern" statement to the header to gain access. Of course, any user that goes to this effort deserves whatever problems it causes, but there is no reason to make it easy. Here is a driver program to test the library with.

main.c:

#include <stdio.h>
#include "private.h"

int main(void) {
    printf("Private = %i\n", get_private());
    printf("Setting Private to 10\n");
    set_private(10);
    printf("Private = %i\n", get_private());
    printf("Setting Private to 30\n");
    set_private(30);
    printf("Private = %i\n", get_private());
    printf("Setting Private to 0\n");
    set_private(0);
    printf("Private = %i\n", get_private());
    printf("Setting Private to 7\n");
    set_private(7);
    printf("Private = %i\n", get_private());
}

Try adding some code to access private_variable directly. It will not compile. The main program does not even know that variable exists! It can still change and read the variable indirectly through the setter and getter functions though.

This is not all. Using this same technique, it is possible to put private functions in the library (perhaps for implementation hiding, or maybe just to keep the namespace uncluttered). Any function in the library can be called by other functions in the library, but they can only be called externally if the function prototype is included in the header file. This makes it easy to use the object oriented ideas behind private variables and functions in C. The library represents the object in this case, and the header file determines what is exposed and what is hidden.
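A private helper function might look like the following sketch, which is a variation on private.c rather than the original file. The helper is_valid is a made-up name. The sketch also marks both the variable and the helper "static", which is an addition to the technique described above: at file scope, static gives a name internal linkage, so even an "extern" declaration written in another file cannot reach it.

```c
/* Variation on private.c. Names marked static are visible only
 * inside this source file. */
static int private_variable = 5;

/* Private helper: not in the header, and static, so it can only
 * be called from this file. */
static int is_valid(int input) {
    return input >= 5 && input <= 10;
}

/* Public setter: invalid input is ignored. */
void set_private(int input) {
    if (is_valid(input))
        private_variable = input;
}

/* Public getter: returns by value, never by address. */
int get_private(void) {
    return private_variable;
}
```

The header and the driver program stay exactly as they are. From the outside the library behaves the same way; the only difference is that the hidden names are now enforced by the linker instead of relying on the user not knowing them.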

This is not one of the object oriented principles that should be avoided if possible. This method of encapsulation and data protection is very straightforward. It is not prone to abuse or errors (and in fact, it is actually designed to reduce the potential for errors). Most of the rest of this series will discuss less stable and manageable techniques that should be used only when absolutely necessary.

Thursday, September 18, 2014

Object Oriented Programming

In my studies, work, and research, I have discovered some important things about Object Oriented Programming, Objects, and how each should be used. I want to examine some misconceptions and less well known facts about objects in programming.

Objects are an abstract data type. At the deepest level, an object is a highly flexible template for creating custom data types. This makes objects a data type of data types or a meta data type. Objects are far more than this though. Objects are an amalgam of useful ideas commonly used in programming and programming languages.

Simply put, objects are containers. Objects can contain data and functions. This last part seems pretty novel. An object is a data type that can contain functions. Further, when a contained function is called, it automatically knows which instance of the object it belongs to. These ideas seem very novel. It turns out that they are not.

Objects have some dirty secrets. Objects are hiding places for global variables. In some cases, like the Singleton design pattern, this is easy to see. In other cases it is not. Objects are also containers that often hide the passing of large argument sets to functions. When used properly, this does not usually cause problems, but it can easily hide massive coupling issues. Objects can easily hide poor programming practices, and some common uses for objects would be considered poor programming if they were done without objects.

It turns out that in many cases objects are unnecessary. Because objects have higher overhead than more primitive data types that can be used for the same things, it is important to know where objects will be beneficial and where they may be detrimental. In some cases, it is a matter of trade off between development time and performance, and a judgment call must be made. In many cases, however, objects are unnecessarily used in places where performance is harmed, but no benefits to development time are gained.

Now I want to look at some examples of gaining some of the benefits of objects without actually using objects. This can result in improved performance without sacrificing anything for it.

A few months ago, I was writing a C program where I needed to keep track of a camera in 3D space. It was important that the user be able to create new camera instances, but it was also important that a main camera exist. The main camera would be used for all graphics calculations, and the user could load different camera instances into the main camera. This design had several benefits. One was that the user did not have to pass a camera to the graphics functions every time they were called. Since the graphics functions would typically be called many times per video frame, the overhead of argument passing was an important bottleneck. The other benefit was that the camera required some internal consistency to work properly. The camera consisted of two vectors that were absolutely required to be perpendicular to each other. Allowing the user to directly modify these vectors would make the graphics functions prone to user error, and it would further put a burden on the programmer to ensure that any direct modifications would maintain the vectors properly.

In an OOP paradigm, this is an easy problem. The main camera could be made a private member variable in a singleton object, and all access would be controlled with getters and setters. In C, however, objects are not supported. Instead I had to use a novel approach that turned out to be at least as easy as an object but with lower memory and argument passing overhead. The camera handling functions were already contained in a separate file from the main program (to facilitate reuse). So, I put a (global) struct instance for the camera in the .c file for the camera library, but I did not export it in the header file. All of the graphics functions that required a camera were contained in the .c file, so they had direct access to the main camera struct. The main program did not have access to it though. I added some getters and setters to the camera library to allow restricted access to the main camera. The result of this was that I used the Singleton design pattern and I even encapsulated the main camera data, all in a programming language that does not have any support for objects. It also improved program efficiency in several areas.
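The original camera code is not reproduced here, so the following is only a sketch of what such a camera library might contain. All of the names (struct camera, camera_set_main, camera_get_main) and the tolerance of 1e-9 are assumptions, and the static keyword is an extra safeguard beyond simply leaving the struct out of the header.

```c
struct vec3 {
    double x, y, z;
};

struct camera {
    struct vec3 up;
    struct vec3 facing;
};

/* The hidden main camera. It starts in a valid state:
 * up along +Y, facing along -Z. */
static struct camera main_camera = { {0.0, 1.0, 0.0}, {0.0, 0.0, -1.0} };

static double dot(struct vec3 a, struct vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Load a user camera into the main camera. Returns 1 on success.
 * Returns 0 and leaves the main camera unchanged if either vector
 * has zero length or the two are not perpendicular (within a small,
 * arbitrarily chosen tolerance). */
int camera_set_main(const struct camera *c) {
    double d = dot(c->up, c->facing);
    if (d < 0.0)
        d = -d;
    if (d > 1e-9)
        return 0;
    if (dot(c->up, c->up) == 0.0 || dot(c->facing, c->facing) == 0.0)
        return 0;
    main_camera = *c;
    return 1;
}

/* Hand back a copy, so the caller never holds a pointer into the
 * hidden state. */
struct camera camera_get_main(void) {
    return main_camera;
}
```

Graphics functions placed in the same file can read main_camera directly with no argument passing, while the main program can only go through camera_set_main, which refuses any camera that would break the perpendicularity requirement.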

This experience led me to another conclusion: Objects are syntactic sugar. In my instance, with the Singleton, I literally avoided passing arguments to frequently used functions. When using large numbers of the same object type, this is impossible. If there had been some benefit to having and using multiple cameras, and if it was normal to use a different camera for each call, passing cameras as arguments would have been more efficient, and in most uses of objects, this is the case. Objects contain references to functions associated with that object type. One benefit of using objects is that the compiler handles the problem of which object instance belongs to which function call. In C, I would have to pass the appropriate data each time I called a function, even if I contained the function references in a struct with the data. The benefit of objects does not, however, improve efficiency. Instead it hides the passing of the argument. This is called "syntactic sugar," because it makes the syntax shorter and faster to type, presumably without making it harder to understand. Syntactic sugar is typically good when it is done well, and in most OOP languages, it is done fairly well. It is important to understand that syntactic sugar does not actually affect program performance though.
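The hidden argument is easy to see when the "object" is built by hand. In this sketch (struct counter and its functions are hypothetical names), the function stored in the struct still needs to be told which instance to work on.

```c
struct counter {
    int value;
    /* "Method" stored as data; the instance must be passed in. */
    void (*increment)(struct counter *self);
};

void counter_increment(struct counter *self) {
    self->value++;
}

void counter_init(struct counter *c) {
    c->value = 0;
    c->increment = counter_increment;
}
```

After counter_init(&c), the call has to be written c.increment(&c), naming the instance twice: once to find the function pointer and once to pass the data. An object oriented language lets the programmer write the equivalent of c.increment() and supplies the second mention behind the scenes. The argument is still passed; it is just no longer typed.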

Now, as I mentioned before, objects are a meta data type. They are a data type for defining new data types. In this way, they can be very useful. When used properly, they can make programs much easier to develop, read, and understand. Objects in programming are very valuable. High value, however, does not mean that it is appropriate to use objects exclusively. Imagine using structs exclusively in C. We could definitely "encapsulate" all of our functions and variables into structs. We could even define a secondary main function, contain it in a struct, and then call it from main as soon as the program starts (and, in fact, in Java and highly object oriented uses of C++, this pattern is highly recommended). The result would be extremely difficult to read and understand, it would waste substantial amounts of memory, and it would destroy the ability of the compiler to optimize memory usage for cache efficiency (this last one is a problem of all object oriented languages). There are some places where using objects just does not make sense. For some program types, encapsulating most things might work well. For others, it is a waste of time. Selective use of objects can optimize design time where necessary while allowing for optimized performance where it is important. It also turns out that for many tasks, writing the paperwork for the object takes far more time than writing the executable code. In short, objects should not be used where they do not make logical sense. There is no benefit derived from using objects where they are unnecessary. Using objects exclusively is like using structs, trees, or any other data structure exclusively. It might be a fun exercise for a challenge, but there is almost no practical application where it is appropriate.

Nearly all of the elements of object oriented programming can be used separately in most modern programming languages. Most newer languages already contain large amounts of syntactic sugar, and many older ones do as well (for instance, the ++ operator in C and C++ is syntactic sugar for simple incrementation). Encapsulation can be attained by grouping things by file (this may not literally make a variable private, but private variables themselves are artificially limited variables that the compiled program knows nothing about). Arrays, structs, tuples, dictionaries, and other data structures can be used to group data, and most languages have some tuple-like mechanism for grouping heterogeneous data. In cases where part of the object paradigm makes sense, but not all of it, it is often possible to get the benefits of the parts you need without actually using objects. This may not always make sense to do, but when it does, it is often better than paying full price for objects when you only need part of them.

Objects and Object Oriented Programming can be very useful in designing and writing applications, but they should never be treated as a complete programming style. Like any other data structure, objects have their place and in their place provide very valuable benefits. Again, like other data structures, overuse of objects results in programs that are inefficient and that do not make logical sense. Objects in programming were designed to model how we imagine real world objects to be. Computers do not think in objects though, and some parts of programming will never fit an object model. Trying to force those things into an object model will ultimately come with extra costs in time, money, and performance. OOP can be very valuable when used properly, but it can cost when it is overused or misused.

Saturday, August 16, 2014

SteamDex: PCB (Printed Circuit Board)

This post will chronicle the PCB drafting, prototyping, and testing. I will add to it as things get done (probably synchronized with the progress updates).

Saturday August 16th, 2014

Today, I finished the initial PCB draft. This required deciding which pins on the microcontroller would be used for which hardware interfaces. The PCB is designed to be mounted on the back of the LCD unit. There are 10 pins in the interface between the two, and I figured that they would be sufficient to handle the weight of the PCB with the other components. The mounting holes for the hardware are on the LCD unit. In theory, if the LCD is mounted to the enclosure, the PCB will not need any explicit mounting because it is firmly attached to the back of the LCD via the 10 pins and the associated solder joints. The two pieces will be connected with a set of header pins.

Also, I was able to arrange the components on the board close enough together to reduce the board footprint. This is great, because it will reduce the cost of the boards. It also means that the device will fit in smaller enclosures. The hardest part of this was routing all of the traces without any vias (it was not mandatory, but I kind of took it as a challenge). At first, I had the LCD module backwards, and the routing was really easy, but once I realized my mistake and flipped it, it became an awful mess. I am so glad I can do this on a two-sided board! (The boards I drafted for classes were only one-sided.)

Here are some screenshots of my work:

Electrical Schematic

PCB Draft

Notice all of that empty space on the left side of the PCB draft? That is about 1/5th of the width I originally allocated to the board (that could amount to a substantial cost savings). Of course, I will be eliminating that wasted space before sending this to a commercial PCB maker. I am also considering adding a ground plane to at least one side of the board. You might notice that some of the silkscreen (the text and outlines) overlaps a lot. This will make no difference as far as functionality goes, but I also plan on doing something about it before sending it to a commercial PCB maker. (I'll probably reduce the font sizes first and then move the BATT text to the top of the battery module and the U0 text to the bottom of the voltage regulator module (LD33V).)

(In case you didn't already know, all of these files will be released under an open source license at the conclusion of the project. Also, in case you were about to ask, I did all of this in KiCad. KiCad is an open source PCB drafting suite. I don't know how it compares to commercial software like Eagle, except that free is way cheaper, especially for commercial use.)

Here is my todo list on the PCB. As items on the list get done, they will be removed and I will add anything interesting to the above section.

Order prototyping parts (photosensitive board, developer solution, and etchant; this is waiting for Kickstarter funding)
Create rough homemade board prototype
Make any necessary revisions (and maybe re-prototype)
Send out revised design to be prototyped by a commercial PCB maker
Make any necessary revisions (and maybe re-prototype)
Get estimates for mass production of boards from a medium scale PCB maker

Wednesday, August 13, 2014

SteamDex: Introduction

Link to the Kickstarter: https://www.kickstarter.com/projects/297892314/steamdex

Table of Contents:

Introduction
PCB (Printed Circuit Board)

The SteamDex is a simple electronic Steampunk device. It works similarly to a rolodex, but it can display nearly any textual data the user desires, so long as it is in chunks that will fit on the screen. It may be able to display small images; however, this will depend on available resources. In short, it will be a small, handheld Steampunk device that can do something useful.

The SteamDex does not yet exist as a physical device. It is currently a figment of my imagination. Of course, I already have most of the parts necessary to construct one, and I have the required knowledge to program the device to do the things stated above. The only things I am missing are time and a little bit of money for the few parts I do not already have.

SteamDex is a Kickstarter project that will be launched sometime within the next week. The project goal of $10,000 will provide enough funding to pay for the rewards, some extra parts to make up what I am missing, and the time required to create the software and design the hardware necessary to build a working SteamDex device.

How far along is the project currently? Well, I already have substantial experience with the MSP430 microcontroller that will be the heart of the device. I also have some experience interfacing the MSP430 with an LCD screen very similar to the one used by the project. Perhaps most importantly (because it is so time consuming), I have already coded a simple 8x8 ASCII font (extended ASCII range included) to use with the device.

What is left to complete the project? The first thing I will have to do is write a driver for the LCD screen. I have done this before, for a very similar screen, and I already have some sample code for the initialization phase. In other words, this will be fairly easy. The second part will be creating a user interface. This will include arranging the visuals and figuring out how the buttons (there will be six of them) will interact with them. The third part will be writing a driver for communicating with the microSD card, where the data set will be stored. Somewhere in there I will also have to design a printed circuit board for the device (this will be trivial). Once these are finished, if enough funding is raised to give me the time, I will add an image loader that will be able to display small images on the screen. This will allow images to be added to the data (you could even make a data set that turns the device into a Pokedex).

What is the point? Did I mention that the SteamDex is a fully functional, useful Steampunk device? I have done a lot of research involving looking at pictures and descriptions of Steampunk devices. There are some really awesome Steampunk things out there. A few of them even do what they look like they should do. Most, however, are just non-functional props and art objects. While this is probably acceptable for most Steampunk weapons, I want to see more Steampunk devices that have real value outside of conventions and other Steampunk gatherings. The SteamDex is my first offering of Steampunk devices that actually work. The SteamDex will be a fusion of art and function that you might just want to use in real life.

What is in it for backers? Well, first and foremost, backers get the good feeling you get when you contribute to an open source project that could benefit millions of people. For backers that contribute more than a few dollars, there are plenty of other rewards. The rewards include a preprogrammed MSP430 microcontroller that can be used to build your own SteamDex (or you can just use it as a conversation piece for bragging rights), a printed circuit board that can be used in constructing a SteamDex, an entire kit for constructing a SteamDex, a partially constructed SteamDex device, and a fully constructed SteamDex device. Also, all backers will receive the source code for the software and the design documents for the hardware for the SteamDex, as well as assembly instructions.

Is there some way I can help other than contributing money? Oh yes there is! Do you have friends that might be interested in contributing money? Maybe you subscribe to a popular Steampunk blog, magazine, or other publication. If you can find contact information for anyone that publishes anything about Steampunk stuff, please tell them about the Kickstarter. If you have followers on Twitter or friends on Facebook, post links to here and to the Kickstarter project once it launches. Even if you cannot provide funding, you can help by spreading awareness, and you can even still benefit because the project is open source and its success will benefit anyone who wants to take advantage of the information.

Once it has launched, the SteamDex Kickstarter will last only 30 days. At minimum funding, the project will probably be finished in January or February 2015, with the last rewards shipping in April 2015. With higher funding, some parts of the project can go much faster (I can cut my hours at my current job if the funding provides enough to pay the difference). I cannot predict exactly how fast the project will get done for a specific amount of funding above the goal, but I can say that some parts, for instance the LCD screen driver, could be finished in less than half the allotted time if I can spend even a few more hours a week on them. (Programming is funny that way. If there are fewer distractions and you can spend longer runs of time on it, you can increase productivity by far more than the proportion of time added.)

I sincerely hope this project is well funded, because any leftover funding at the end will give me time to work on other Technium Adeptus projects. While I am very excited about this project, I have other projects for Technium Adeptus that will do far more for helping its goals than this project will. When this project is completed, I hope to be able to put another Technium Adeptus project on Kickstarter and another after that, all with the goal of making modern technology more accessible to normal people and of increasing the rate of technological progress by supporting and producing open source software and hardware.

Monday, February 3, 2014

Intro to Technium Adeptus: Magic

If you ask a historian what magic is, you will probably be told that magic is a word describing anything we do not understand. For example, a thousand years ago, electricity would have been called magic, cars would have been magic, and computers would have been magic. There is a fundamental flaw in this argument and even in the common analogies. A thousand years ago, electricity, cars, and computers were not called magic. "Would have been" is worthless when talking about the semantics of language, because there is no proof. A thousand years ago, the word "magic" (or the appropriate translation, since modern English did not exist) referred to things that people actually observed. It was also sometimes used in reference to things they imagined. Magic is not a word used to describe something that has never been observed or imagined. It is used to describe things that have been observed or imagined. Now, people can imagine things that they have not observed; however, they can almost never accurately imagine something that will be discovered in the future. For instance, I believe it was H.G. Wells who wrote about space travel, but the ship was some kind of wooden boat. This is imagined magic. The real thing is a metal spaceship that makes copious amounts of fire and smoke. Electricity, cars, and computers had never even been imagined a thousand years ago, as such. The things that were magic a thousand years ago were things like divination (still considered magic by many today) and gunpowder. A few hundred years before that, gunpowder was not magic though, because it did not exist. Electricity was not magic a thousand years ago, because you cannot call something magic if you cannot at least imagine it. In fact, nothing has ever been called magic before it was observed either in reality or in imagination. This leads to a different conclusion. Gunpowder was magic a thousand years ago, because most people did not understand it.
Some people must have understood it though, otherwise it would not have existed to be called magic. So, something is magic if there are people who understand it, but they are very few.

Now, you might ask about natural events. For instance, is lightning not magic? What is the difference between magic, nature, and acts of God? Historically, lightning was not magic. Rather, it was just an act of God or nature. Likewise, rains and floods were not magic. The difference is the agent. If something happens without human intervention, it is not magic. An eclipse is the wrath of an angry God or a natural phenomenon, not magic. Now, if a human could convince people that he or she was responsible for the eclipse, then it would be called magic. Otherwise, though, it was either nature or God causing it through whatever holy or unholy power. So again, magic is something that people understand, but the people who understand it are very few. It is also something that is caused by human action (or at least perceived to be so).

Now that we have had this philosophical discussion on the definition of magic, I want to apply this definition to modern times. What things exist today that are understood, but only by a very small percentage of the population? Most of modern technology, it turns out. Electricity is still magic, even though it was discovered and harnessed hundreds of years ago. Computers and computer programming are magic. Most people do not understand how an internal combustion engine works, and modern cars have embedded computers, so cars are magic. Mechanical engineering is magic. Physics is magic, especially quantum physics (string theory is not magic because no one really understands it, or maybe it is the imaginary kind of magic because it has no "practitioners"). Biology and genetics are also magic. In fact, there is more magic today than there has ever been in the history of the world, because there has been no time in history when more people used technology that they understood less. So, modern technology is magic. Now, out of the above examples, some things are well understood by those who understand them, and others are poorly understood by those who understand them. Electricity is fairly well understood. Computers and programming are fairly well understood, along with classical physics, cars, and mechanical engineering. Biology is not so well understood, and quantum physics and genetics are pretty poorly understood. All of these things have potentially powerful applications, but few of them are really accessible to the general public. Otherwise stated, part of the reason many of these things are not understood by most people is that they are very difficult to learn, reliable information is hard to find, and they have prerequisites that must be met before they can be understood (for instance, mechanical engineering and quantum physics require advanced math). This is why most of them require at least four years of college to gain even a partial understanding.
There are a few of them, however, that are not that hard to learn. Electricity and computer science are actually easy to gain a good foundation in. It is common for students in these fields to already have prior experience in them, which they gained through the use of tutorials found on the internet (this is, in fact, how I gained my foundation in electrical engineering). One of the reasons many people find these subjects to be difficult is that tutorials, and often even college classes, are poorly designed for people who have no prior experience. Technium Adeptus aims to solve this problem.

Our goal is to teach people magic. Instead of starting with tons of theory that is hard to understand without any practical experience with technology, we are going to start with application. Our tutorials/lessons will be designed in more of a walkthrough style that explains bits of theory as they are used. This will allow people to successfully complete projects without necessarily understanding exactly how everything works, but it will still provide the opportunity to learn. Because electrical components are reusable and the software we will be using is free, people who feel they need to go through a lesson multiple times to improve their understanding of the theory will not have any extra costs besides time (another reason electricity and computer programming are easier to learn). Our instructions may sometimes be a bit involved, but following them correctly will lead to correct results.

We believe that this learning style will be more effective because it is more natural. Most people seem to learn everyday things by following instructions and slowly picking up on how things work as they go. This learning style may not be suitable for formal learning like school because it tends to be slower, but it also seems to result in better long term memory and understanding of the subject. We hope that our readers will keep this in mind as they work through the tutorials. We also hope that they will experiment as they go and even work through some of the tutorials multiple times. Doing these things will help maximize the learning and retention. Our hope is that we can make learning magic easy enough that many people who might not otherwise even try will be able to learn with ease.

Wednesday, January 22, 2014

Technium Adeptus

Technically, Technium Adeptus is a social experiment that this blog is named after. This blog was created for two reasons. The first is that a friend suggested that it would be a good idea for me to start a technical blog, because it might improve my employability. I do a lot of technical projects, and writing about them would not only be useful for employment prospects, but it could help a lot of people with similar projects (I seem to pick projects where it is very difficult to find good information). The second is the social experiment. I am not going to discuss the experiment in any detail because it might taint the data. The reason this will be useful for it is that I have started writing two books that I plan to publish, and this would be a good place to discuss things beyond the topics of the books that might be useful for those who are using the books.

Technium Adeptus is a pseudo-religious pro-technology meme. Technium represents technology, and Adeptus represents an organization or hierarchy of those who work to advance technology in a way that encourages free exchange and use of technological knowledge. Technium Adeptus supports open source software and hardware. For any more information, you will have to read the book once I finish it.

All of the books of the Technium Adeptus will have some focus on teaching how to create and control technology. The first will include tutorials on basic electronics and programming. The second one, which I am currently working on, will focus on programming an ARM processor designed for embedded applications. The third, which has not been started, will focus on programming a far simpler embedded processor, with some focus on assembly language programming. Since the latter two of these will use an open source Linux development environment, one of my next posts will be instructions for installing Linux on a VirtualBox installation and getting the USB pass-through set up for loading programs onto the microprocessors.

One of my goals with Technium Adeptus and the associated books is to make creation and control of technology more accessible to normal people. I am spending extra effort on explaining complicated processes in more detail, and in simpler terms, so that less technologically literate people can understand more easily. I am also trying to format much of my material in more of a walk-through fashion, with fewer assumptions about what readers may already know. As such, if my readers find anything confusing, I hope they will comment. I do not typically approve this kind of comment for public viewing (to avoid clutter), but I will read the comments and try to fix any problems they point out. I am not opposed to fixing issues in posts even after they are several years old.

I plan on using a similar post naming scheme to my Makeshift Technology blog. Blog posts that are part of a series on a particular subject will be named with the subject name, followed by a colon, followed by a subtitle for the post. Maybe this will make it easier for people following specific projects.

I hope the knowledge I provide here finds its way to those who are looking for it. Enjoy!