I spent a few days last week catching up on the Hadoop tech out there, and of everything I saw, Apache Pig impressed me the most. Essentially, Pig is a way to transform and explore data in a style that is part procedural programming, part SQL. Want to manipulate data with external functions written in your preferred programming language? Pig has UDFs (user-defined functions) for that. Need to join data quickly? It has that, too. Want to run Pig on Spark? Get a Spork.