There are two ways to use Python with Pig.
Example: We have a dump of data from seismic sensors. We need to find all the locations where there has been an earthquake of magnitude > 5, and we want the number of such quakes in the data. We filter the data with one UDF and use another UDF to align the data properly. Python UDFs are used for the sake of the demo. The full code for the project is available here. Notice that everything is under one project. In the code there are two folders, QuakeDataRunner and QuakeDataRunner2, which demonstrate the two approaches.
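To make the task concrete, here is a plain-Python sketch of the same computation outside Pig. The `(location, magnitude)` record layout, the function name `count_strong_quakes`, and the sample records are all made up for illustration; they are not taken from the project.

```python
from collections import Counter


def count_strong_quakes(records, threshold=5.0):
    """Count quakes with magnitude strictly above `threshold`, per location.

    `records` is an iterable of (location, magnitude) pairs (assumed layout).
    """
    counts = Counter()
    for location, magnitude in records:
        if magnitude > threshold:
            counts[location] += 1
    return dict(counts)


# Invented sample data, only to exercise the function.
sample = [
    ("Tokyo", 5.4),
    ("Lima", 4.9),
    ("Tokyo", 6.1),
    ("Lima", 5.0),   # exactly 5 is not "> 5", so it is skipped
    ("Quito", 7.2),
]
result = count_strong_quakes(sample)
```

In Pig the same steps would roughly correspond to a filter, a group by location, and a count.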
A) One is to separate the Python code (the UDFs) from the Pig script that uses them.
In this case, the UDF file is imported into the Pig script using Jython; Pig uses its internal Jython engine for this purpose. The files are shown below. First, the Python UDF is as follows:
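A minimal sketch of what such a UDF file might look like. The file name `quake_udfs.py`, the function names `is_strong_quake` and `align_location`, and the idea that "aligning" means normalizing the location string are assumptions made for illustration, not the project's actual code. The sketch also assumes that Pig's Jython engine supplies the `outputSchema` decorator when the file is registered, and adds a fallback so the file can be imported outside Pig.

```python
# quake_udfs.py (hypothetical file name)

try:
    outputSchema  # assumed to be provided by Pig when registered USING jython
except NameError:
    # Fallback so the file can also be imported and tested outside Pig.
    def outputSchema(schema):
        def decorator(func):
            func.outputSchema = schema
            return func
        return decorator

MAGNITUDE_THRESHOLD = 5.0  # the post's cutoff: magnitude > 5


@outputSchema("strong:int")
def is_strong_quake(magnitude):
    """Return 1 if magnitude exceeds the threshold, else 0.

    Nulls and unparseable values are treated as "not strong".
    """
    if magnitude is None:
        return 0
    try:
        return 1 if float(magnitude) > MAGNITUDE_THRESHOLD else 0
    except (TypeError, ValueError):
        return 0


@outputSchema("location:chararray")
def align_location(location):
    """Trim, collapse whitespace and upper-case a location string
    so that equal locations end up in the same group."""
    if location is None:
        return None
    return " ".join(location.split()).upper()
```

A Pig script would typically pull this file in with a statement such as `REGISTER 'quake_udfs.py' USING jython AS quake;` and then call the functions as `quake.is_strong_quake(magnitude)` and `quake.align_location(location)`. The filter UDF returns an int rather than a boolean here as a conservative choice, so the Pig filter condition would compare the result with 1.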
The Pig script that uses the UDF above is as follows:
Click for the code used in this post.
1. Programming Pig by Alan Gates.
2. Embedding Pig in scripting Languages by Julien Le Dem.