Transpose in Oracle

Transposing data means turning rows into columns. From version 11g onwards, this is possible in Oracle as well: values that appear in rows can be translated into columns. Doing so creates a new table with an additional set of columns, with the column names being derived…

Reading and writing in Java

Reading from and writing to files is not easy in Java. This is already apparent from a simple Google search for “Java Filereader problem”, which returned 477,000 hits. Apparently, reading (and writing) is not trivial. I wrote a small program that reads a file and copies its contents to another file.…
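A minimal sketch of what such a copy program could look like is given below. It is not the code from the post itself; the class name `FileCopyExample`, the method `copyLines` and the file names `input.txt` and `output.txt` are hypothetical, and the sketch assumes a UTF-8 text file that is copied line by line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name; an illustration, not the program from the post.
class FileCopyExample {

    // Copies a text file line by line and returns the number of lines copied.
    static int copyLines(Path source, Path target) throws IOException {
        int count = 0;
        // try-with-resources closes both files, even when an exception is thrown
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine(); // writes the platform line separator
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder file names, used when no arguments are given
        Path source = Paths.get(args.length > 0 ? args[0] : "input.txt");
        Path target = Paths.get(args.length > 1 ? args[1] : "output.txt");
        int copied = copyLines(source, target);
        System.out.println("Copied " + copied + " lines");
    }
}
```

Because the file is copied line by line, line endings are normalised to those of the platform. For an exact byte-for-byte copy, `Files.copy` from `java.nio.file` would be the simpler choice.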

Another set of keys and values

Another example of how mappers and reducers are used in a Hadoop context is given below. This programme consists of three classes. One class is an overall class that calls the two other classes: a mapper class and a reducer class. The mapper class reads a file and creates a series of words. In the first…
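The division of labour between an overall class, a mapper and a reducer can be sketched in plain Java, without the Hadoop libraries. This is an illustration only: the excerpt does not say what the reducer computes, so the sketch assumes, as an example, that the words are counted. The names `WordCountSketch`, `map`, `reduce` and `run` are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration; hypothetical names, no Hadoop API involved.
class WordCountSketch {

    // "Mapper": turns one line of text into a series of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Reducer": sums the values per key (assumed here to be a word count).
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by word
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // "Overall class" logic: calls the mapper for every line, then the reducer.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String line : lines) {
            allPairs.addAll(map(line));
        }
        return reduce(allPairs);
    }

    public static void main(String[] args) {
        // prints {cat=1, dog=1, down=1, sat=2, the=2}
        System.out.println(run(Arrays.asList("the cat sat", "the dog sat down")));
    }
}
```

In a real Hadoop job the framework itself groups the pairs by key between the two steps; here a single list stands in for that.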

Map and reduce – what happens?

In Big Data, the concept of mapping and reducing plays a huge role. The idea is that a massive dataset is split over several servers. On each server, part of the data is processed; this step is called the mapper. In a subsequent step, the partial results are merged into a single outcome. This latter…
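The split, process and merge idea can be imitated on a single machine. The sketch below is an assumption-laden illustration, not Hadoop code: sublists stand in for the servers, and the highest value is taken as an example computation. The names `SplitAndMergeSketch`, `split`, `partialMax` and `maxOf` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Single-machine illustration of split / map / reduce; hypothetical names.
class SplitAndMergeSketch {

    // Splits the data into parts of at most partSize elements,
    // standing in for the distribution over several servers.
    static List<List<Integer>> split(List<Integer> data, int partSize) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        List<List<Integer>> parts = new ArrayList<>();
        for (int start = 0; start < data.size(); start += partSize) {
            int end = Math.min(start + partSize, data.size());
            parts.add(data.subList(start, end));
        }
        return parts;
    }

    // Map step: each part yields its own partial result (here: its maximum).
    static int partialMax(List<Integer> part) {
        int max = Integer.MIN_VALUE;
        for (int value : part) {
            max = Math.max(max, value);
        }
        return max;
    }

    // Reduce step: the partial results are merged into one outcome.
    // Returns Integer.MIN_VALUE for an empty dataset.
    static int maxOf(List<Integer> data, int partSize) {
        List<Integer> partials = new ArrayList<>();
        for (List<Integer> part : split(data, partSize)) {
            partials.add(partialMax(part));
        }
        return partialMax(partials);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, 41, 7, 19, 28, 5, 36);
        // Parts: [3, 41, 7], [19, 28, 5], [36]; partial results: 41, 28, 36
        System.out.println(maxOf(data, 3)); // prints 41
    }
}
```

This only works because the maximum of the partial maxima equals the maximum of the whole dataset; not every computation can be merged this simply.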

Hive – connecting from SQL Developer

My impression is that the big development now taking place in the world of Big Data is the creation of connectors. Such connectors enable us to continue using standard tools (R, for example) with data stored in Hadoop. I am very much impressed with Hive. Hive allows us to access data stored…

R – the shortest name possible

For some reason, short names are popular for computer languages. Think of “C”. Another example is “R”. R reminds me a bit of Matlab; it is an easy-to-learn language with immense statistical possibilities. It is compared to today’s giants such as SAS. The advantage of R is that it is widely accepted by the…

Hive, SQL on Hadoop

In a previous post, I discussed the difficulty of using Hadoop with its Big Data structure. One must write two different Java programmes: one is a so-called mapping programme, the other is the reduce programme.

Pig: yet another approach to handling big data

In another post, I discussed how Java can be used to analyse data in a Big Data environment. The problem then lies with Java itself. Java is not a tool for the faint-hearted; it is difficult. Moreover, one must comply with a structure in which one must write two programmes: a mapping programme and a…

Python: another language to access Big Data

In an earlier post, I showed how Java could be used to access Big Data. I also stated that I had many problems with Java itself. I noted that I was not the only one to have issues with Java. A much easier language is Python. This language is really easy to learn and it…

Hadoop: my first Java programme

Today, I created a Java programme to get myself acquainted with the usage of Hadoop. I started from an existing Java programme (https://github.com/tomwhite/hadoop-book/blob/master/ch02/src/main/java/OldMaxTemperature.java) and tweaked it to fit my own situation.