Making Web Applications More Efficient with a Graph Database
This week, at the 38th International Conference on Very Large Databases—the premier database conference—researchers from MIT’s Computer Science and Artificial Intelligence Laboratory presented a new system that automatically streamlines websites’ database access patterns, making the sites up to three times as fast. And where other systems that promise similar speedups require the mastery of special-purpose programming languages, the MIT system, called Pyxis, works with the types of languages already favored by Web developers.
Pyxis solves all three problems. It automatically partitions a program between application server and database server, and it does it in a way that can be mathematically proven not to disrupt the operation of the program. It also monitors the CPU load on the database server, giving it more or less application logic to execute depending on its available capacity.
Pyxis begins by transforming a program into a graph, a data construct that consists of “nodes” connected by “edges.” The most familiar example of a graph is probably a network diagram, in which the nodes (depicted as circles) represent computers, and the edges (depicted as lines connecting the circles) represent the bandwidth of the links between them. In this case, however, the nodes represent individual instructions in a program, and the edges represent the amount of data that each instruction passes to the next.
“The code transitions from this statement to this next statement, and there’s a certain amount of data that has to be carried over from the previous statement to the next statement,” Madden explains. “If the next statement uses some variable that was computed in the previous statement, then there’s some data dependency between the two statements, and the size of that dependency is the size of the variable.” If the whole program runs on one computer, then the variable is stored in main memory, and each statement simply accesses it directly. But if consecutive statements run on separate computers, the data has to make the jump with them.