Map Reduce<br />Map things, reduce and show it!<br />Lazar Laszlo (http://www.blogger.com/profile/03228275955602408884)<br /><br />Interesting Internet Movie Database statistics - in Python<br />Published 2010-06-10T06:35:00.000-07:00, updated 2010-06-10T07:40:10.748-07:00<br /><br />In one of my <a href="http://map-reduce.blogspot.com/2010/05/python-episode-2.html">previous posts</a> I showed how to load the database files from IMDB in the Python shell.<br /><br />In the same way, not only the release year but also other information can be loaded, such as the language, genre, ratings, country, etc.<br /><br />To plot graphs in Python you can use <a href="http://matplotlib.sourceforge.net/">matplotlib</a>. To use this library you will also need the <a href="http://numpy.scipy.org/">numpy</a> package.<br /><br />All the functions used to extract information from the loaded database can be found at the end of the post.<br />You can download the full code to load the database and make the queries from: <a href="http://www.miner-mole.com/download/imdb.py">imdb.py</a> and <a href="http://www.miner-mole.com/download/query.py">query.py</a><br /><br />Let's obtain the number of movies by year:<br /><div style="background-color:#e0e0e0;"><br />> MbY = query.MoviesByYear(imdb.Movies)<br /></div><br /><br />To plot the resulting data:<br /><div style="background-color:#e0e0e0;"><br />> from pylab import plot,show,legend<br />> plot(MbY.keys(), MbY.values())<br />> show()<br /></div><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6TwSfdEnROK6y_eK8mVpyvwUQ86PiXH4tZufb7pRjhhyphenhyphenXPISIVzf44Th6i8206TI2wyfCOoHGw4rAPbPH_hH0TjGCVyS2vV-pKAZwSPlC1l_4K5jXB-gjEiKDz9Q-_f91sT9yGxllynOz/s1600/imdb1.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; 
cursor:hand;width: 400px; height: 303px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6TwSfdEnROK6y_eK8mVpyvwUQ86PiXH4tZufb7pRjhhyphenhyphenXPISIVzf44Th6i8206TI2wyfCOoHGw4rAPbPH_hH0TjGCVyS2vV-pKAZwSPlC1l_4K5jXB-gjEiKDz9Q-_f91sT9yGxllynOz/s400/imdb1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5481137664385565938" /></a><br /><br />Now let's see the number of movies by country:<br /><div style="background-color:#e0e0e0;"><br />> MC = query.ByCountry(imdb.Movies)<br />> MC[0:10]<br />[('USA', 328177),<br /> ('UK', 64717),<br /> ('France', 38066),<br /> ('Germany', 31408),<br /> ('Japan', 28819),<br /> ('Canada', 24745),<br /> ('Italy', 23877),<br /> ('India', 23687),<br /> ('Spain', 18313),<br /> ('Mexico', 17544)]<br /></div><br /><br />Plot the movie count for the USA by year:<br /><div style="background-color:#e0e0e0;"><br />> USA = query.CountryByYear(imdb.Movies, 'USA')<br />> plot(USA.keys(), USA.values())<br />> show()<br /></div><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwd5JPiLv79Ipl7ROTL5U35w_1B-VnOPBYTf5i6IVLU9k4z-PKC6WCZRtn2a3sjCTFXqf4gAq2O3JXXGZARxcoaow5Ai0liJCzLgGzM_LyaTDC7rKpnUwRNJs_bq2J9rdT7Ym6UkSGgDvB/s1600/imdb2.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 242px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwd5JPiLv79Ipl7ROTL5U35w_1B-VnOPBYTf5i6IVLU9k4z-PKC6WCZRtn2a3sjCTFXqf4gAq2O3JXXGZARxcoaow5Ai0liJCzLgGzM_LyaTDC7rKpnUwRNJs_bq2J9rdT7Ym6UkSGgDvB/s320/imdb2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5481140062926722034" /></a><br /><br />Now plot more countries on the same figure:<br /><div style="background-color:#e0e0e0;"><br />> UK = query.CountryByYear(imdb.Movies, 'UK')<br />> France = query.CountryByYear(imdb.Movies, 'France')<br />> Germany = query.CountryByYear(imdb.Movies, 'Germany')<br />> Japan = 
query.CountryByYear(imdb.Movies, 'Japan')<br />> Canada = query.CountryByYear(imdb.Movies, 'Canada')<br />> p1=plot(UK.keys(), UK.values())<br />> p2=plot(France.keys(), France.values())<br />> p3=plot(Germany.keys(), Germany.values())<br />> p4=plot(Japan.keys(), Japan.values())<br />> p5=plot(Canada.keys(), Canada.values())<br />> legend( (p1, p2, p3, p4, p5), ('UK', 'France', 'Germany', 'Japan', 'Canada'), loc='upper left', shadow=True)<br />> show()<br /></div><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiB0JJdbFC3jffBttFshx_wUiIWtLIaO9cJoi7aUfpeM2lPCFjsjzYn51OOTAp0VRY0HoWHLnBzffdv_Um6xePWwNFU0hmtDXqbatJ_bqW3-wwfjlixM59ljdtqHhk-w01SBOph6cirjder/s1600/imdb3.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 302px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiB0JJdbFC3jffBttFshx_wUiIWtLIaO9cJoi7aUfpeM2lPCFjsjzYn51OOTAp0VRY0HoWHLnBzffdv_Um6xePWwNFU0hmtDXqbatJ_bqW3-wwfjlixM59ljdtqHhk-w01SBOph6cirjder/s400/imdb3.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5481142887967445282" /></a><br /><br />For Germany the movie count is 0 between 1950 and 1989: the country was divided into East and West Germany, and movies from that period are presumably listed under those two country names instead.<br /><br />Now let's see the same plots for the movie count by language:<br /><div style="background-color:#e0e0e0;"><br />> BL = query.ByLanguage(imdb.Movies)<br />> BL[0:10]<br />[(u'English', 409215),<br /> (u'Spanish', 50291),<br /> (u'German', 43118),<br /> (u'French', 35512),<br /> (u'Japanese', 26340),<br /> (u'Italian', 22422),<br /> (u'Portuguese', 9902),<br /> (u'Hindi', 8362),<br /> (u'Dutch', 8161),<br /> (u'Russian', 8131)]<br />> Eng = query.LangByYear(imdb.Movies, 'English')<br />> Sp = query.LangByYear(imdb.Movies, 'Spanish')<br />> Ger = query.LangByYear(imdb.Movies, 'German')<br />> Fr = query.LangByYear(imdb.Movies, 'French')<br />> Jp = 
query.LangByYear(imdb.Movies, 'Japanese')<br />> p1=plot(Eng.keys(), Eng.values())<br />> p2=plot(Sp.keys(), Sp.values())<br />> p3=plot(Ger.keys(), Ger.values())<br />> p4=plot(Fr.keys(), Fr.values())<br />> p5=plot(Jp.keys(), Jp.values())<br />> legend( (p1, p2, p3, p4, p5), ('English', 'Spanish', 'German', 'French', 'Japanese'), loc='upper left', shadow=True)<br />> show()<br /></div><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcjNQchd5QbZYcdnuxz5-ZajEhZSzgdfAFUXUWsSzeCEjYjaJshDZrGmN0cwvBQ9zW66e1PFZWQogPRV5ZDVfME5ABPBe2gtLODwiLSVmNpiJ29fCEOBAX4_y1fV8LJK4e0sCR5azvy0ob/s1600/imdb4.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 303px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcjNQchd5QbZYcdnuxz5-ZajEhZSzgdfAFUXUWsSzeCEjYjaJshDZrGmN0cwvBQ9zW66e1PFZWQogPRV5ZDVfME5ABPBe2gtLODwiLSVmNpiJ29fCEOBAX4_y1fV8LJK4e0sCR5azvy0ob/s400/imdb4.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5481149383992782978" /></a><br /><br />Below are the simple functions used to extract the information for the statistics.<br /><div style="background-color:#e0e0e0;"><br /><p><br /># Number of movies by year:<br />def MoviesByYear(i):<br />    data={}<br />    for k,v in i.iteritems():<br />        if v.has_key('year') and v['year'].isdigit():<br />            if data.has_key(int(v['year'])) :<br />                data[int(v['year'])] = data[int(v['year'])] + 1<br />            else:<br />                data[int(v['year'])] = 1<br />    return data<br /></p><p><br /># Number of movies by year per country:<br />def CountryByYear(i,country):<br />    data={}<br />    for k,v in i.iteritems():<br />        if v.has_key('country') and v['country']==country:<br />            if v.has_key('year') and v['year'].isdigit():<br />                if data.has_key(int(v['year'])) :<br />                    data[int(v['year'])] = data[int(v['year'])] + 1<br />                else:<br />                    data[int(v['year'])] = 1<br />    return data<br /></p><p><br /># Number of movies by language<br />def 
languagesort(x,y):<br />    # compare (name, count) pairs so that the highest count comes first<br />    if x[1] > y[1]:<br />        return -1<br />    if x[1] < y[1]:<br />        return 1<br />    if x[1]==y[1]:<br />        return 0<br /> <br />def ByLanguage(i):<br />    data={}<br />    for k,v in i.iteritems():<br />        if v.has_key('language') :<br />            if data.has_key(v['language']) :<br />                data[v['language']] = data[v['language']] + 1<br />            else:<br />                data[v['language']] = 1<br /> <br />    # list of (language, count) pairs, sorted by descending count<br />    ll = data.items()<br />    ll.sort(cmp = languagesort)<br />    return ll<br /></p><p><br /># Number of movies by country<br />def ByCountry(i):<br />    data={}<br />    for k,v in i.iteritems():<br />        if v.has_key('country') :<br />            if data.has_key(v['country']) :<br />                data[v['country']] = data[v['country']] + 1<br />            else:<br />                data[v['country']] = 1<br /> <br />    # list of (country, count) pairs, sorted by descending count<br />    ll = data.items()<br />    ll.sort(cmp = languagesort)<br />    return ll<br /></p><p><br /># Number of movies by year per language:<br />def LangByYear(i,lang):<br />    data={}<br />    for k,v in i.iteritems():<br />        if v.has_key('language') and v['language']==lang:<br />            if v.has_key('year') and v['year'].isdigit():<br />                if data.has_key(int(v['year'])) :<br />                    data[int(v['year'])] = data[int(v['year'])] + 1<br />                else:<br />                    data[int(v['year'])] = 1<br />    return data<br /></p><br /></div>Lazar Laszlo<br /><br />Inclinometer in Python, for Symbian phones with accelerometer<br />Published 2010-05-27T06:09:00.000-07:00, updated 2010-05-27T08:20:17.283-07:00<br /><br />Another example of how easily and quickly applications can be developed in Python.<br /><br />This one is a simple inclinometer (also known as a tilt meter, tilt indicator, slope alert, slope gauge, gradient meter, gradiometer, level gauge, level meter, declinometer, and pitch & roll indicator).<br /><br />But first you need to install <a href="https://garage.maemo.org/frs/?group_id=854">Python for Symbian S60</a>. This can be installed on phones with Symbian OS 3rd or 5th edition. 
Download the PyS60 binaries and install the runtime and the shell (copy the .sis files to the phone and launch them to install, or install them with your phone's software).<br />The Windows setup is only needed to create installable packages from your Python application.<br /><br />The code is the following:<br /><div style="border:1px solid gray;background-color:#f0f0f0;"><br />from sensor import *<br />import e32<br />from appuifw import *<br /><br />print "Accelerometer by Lazar Laszlo (c) 2009"<br /><br /># Define exit function<br />def quit():<br />    App_lock.signal()<br />app.exit_key_handler = quit<br /> <br />app.screen = 'large' # Screen size set to 'large'<br />c = Canvas()<br />app.body = c<br />s1,s2=app.layout(EScreen)<br />mx = s1[0]<br />my = s1[1]<br />m2x = mx/2<br />m2y = my/2<br /><br /># Function which draws circle with given radius at given co-ordinate<br />def circle(x,y,radius=5, outline=0, fill=0xffff00, width=1):<br />    c.ellipse((x-radius, y-radius, x+radius, y+radius), outline, fill, width)<br /> <br /><br />class Inclinometer():<br />    def __init__(self):<br />        self.accelerometer = \<br />            AccelerometerXYZAxisData(data_filter=LowPassFilter())<br />        self.accelerometer.set_callback(data_callback=self.sensor_callback)<br />        self.counter = 0<br /><br />    def sensor_callback(self):<br />        # reset inactivity watchdog at every 20th read<br />        if self.counter % 20 == 0:<br />            e32.reset_inactivity()<br /><br />        # redraw at every 5th read<br />        if self.counter % 5 == 0:<br />            c.clear()<br />            # x/y as a circle relative to the screen center (m2y replaces the hard-coded 160)<br />            circle(m2x+self.accelerometer.x*2, m2y-self.accelerometer.y*2, 7, fill=0x0000ff)<br />            if self.accelerometer.z > 0:<br />                c.rectangle((0,m2y,15,m2y+self.accelerometer.z*2),fill=0x00ff00)<br />            if self.accelerometer.z < 0:<br />                c.rectangle((0,m2y+self.accelerometer.z*2,15,m2y),fill=0x00ff00)<br />            c.line((0,m2y,mx,m2y),outline=0,width=1)<br />            c.line((m2x,0,m2x,my),outline=0,width=1)<br />        self.counter = self.counter + 1<br 
/><br />    def run(self):<br />        self.accelerometer.start_listening()<br /><br />if __name__ == '__main__':<br />    d = Inclinometer()<br />    d.run()<br />    App_lock = e32.Ao_lock()<br />    App_lock.wait() # Wait for exit event<br />    d.accelerometer.stop_listening()<br />    print "Exiting Accelerometer"<br /></div><br /><br />The <b>appuifw</b> module contains the functions and objects for the graphical user interface. You can set the application's window size with the <b>app.screen</b> variable. To get a large drawing canvas set <b>app.screen = 'large'</b> (or 'full' for the entire screen) and set the application body to the <b>Canvas</b> object.<br /><br />The application gets the accelerometer data through a callback. To avoid redrawing the screen at every read, a simple counter is used. I doubled the accelerometer's values to increase the circle's movement. The x and y values are displayed as a circle positioned relative to the screen's center. The z value is displayed as a green bar.<br /><br />I reset the phone's inactivity watchdog with the <b>e32.reset_inactivity()</b> function to keep the back-light on while the application is running.<br /><br />If your phone has a magnetometer (compass) you can switch the sensor from <b>AccelerometerXYZAxisData</b> to <b>MagnetometerXYZAxisData</b> and the <b>self.accelerometer</b> variables to <b>self.magnetometer</b> to display the direction to North.<br /><br />To run the application, just save the code to a text file with the .py extension and copy the file to the phone's \DATA\PYTHON directory (or into the \PYTHON directory in the phone memory).<br /><br />Screenshots from my Nokia E52:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwc6jlHJXhtTDlOSxRsDOE8ig8DsHtJXPvr01eBgB-KHxaFkCZdHgLsMESpiTf1bCO7nmto2eM4AWHqkUVbvRZrqe-zWwjrGXZmcV87p0UdLVqTOC0ofGCEioSNgVIK_BWLIbvOL_aezFa/s1600/p1.jpg"><img style="block:right; margin:0 0 10px 10px;cursor:pointer; 
cursor:hand;width: 240px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwc6jlHJXhtTDlOSxRsDOE8ig8DsHtJXPvr01eBgB-KHxaFkCZdHgLsMESpiTf1bCO7nmto2eM4AWHqkUVbvRZrqe-zWwjrGXZmcV87p0UdLVqTOC0ofGCEioSNgVIK_BWLIbvOL_aezFa/s400/p1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5475963371491605602" /></a><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbAIx2nu2QULvD046rGG5w2PkLrMaovuoakqkkrB4hMZqKXdSPaTvFMhlbjzKDNllh-09QfGjwv_mCrguaXwEF1VCmZQu5Zmf39vEi5mwSTiUuFgPS3lZ0liHzYrVhB0pmSN1-_EHm4hyphenhyphenN/s1600/p2.jpg"><img style="block:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 240px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbAIx2nu2QULvD046rGG5w2PkLrMaovuoakqkkrB4hMZqKXdSPaTvFMhlbjzKDNllh-09QfGjwv_mCrguaXwEF1VCmZQu5Zmf39vEi5mwSTiUuFgPS3lZ0liHzYrVhB0pmSN1-_EHm4hyphenhyphenN/s400/p2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5475963425816241890" /></a><br /><br><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidUn2P4Fio0Z4wGWcfZDIS7vcnVemaBk38ojdCvmXNQtJPTM4JA4cc905VZBJvu1s781C70uj-l0vhxlov6bJZUVMvoFLsALAOYQE_z5npC5Kh53sePYPf2nQ1vzt_UQ0Mh2jcKhY9Ci1a/s1600/p3.jpg"><img style="block:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 240px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidUn2P4Fio0Z4wGWcfZDIS7vcnVemaBk38ojdCvmXNQtJPTM4JA4cc905VZBJvu1s781C70uj-l0vhxlov6bJZUVMvoFLsALAOYQE_z5npC5Kh53sePYPf2nQ1vzt_UQ0Mh2jcKhY9Ci1a/s400/p3.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5475963596049515906" /></a><br /><br />Feel free to use and play with this small code.Lazar 
Laszlo<br /><br />Python, episode 2.<br />Published 2010-05-24T08:20:00.000-07:00, updated 2010-05-24T10:54:58.763-07:00<br /><br />Now back to Python.<br /><br /><a href="http://www.addedbytes.com/cheat-sheets/python-cheat-sheet/">Here is a nice cheat sheet about some basic language features, commands, variables.</a><br /><br />In my first post about Python I presented the IPython shell add-on. It's very useful if you are doing some programming directly in the Python shell. You don't need to know all the functions of the imported modules or objects; you can use tab completion to quickly display all available members of an object or a module.<br />For example, to find out the version of your Python, first import the "sys" module with:<h4>import sys</h4>Then type "sys." and press the TAB key; you will get a list of all available functions and member variables of the "sys" module. If you type "v" and press TAB again, you will get two options: sys.version and sys.version_info. The first option is already typed in the command line, so you can press Enter to get the version information.<br />In Python everything is an object, so every variable has some member functions. Take a string, for example:<br />mystring = 'My string'<br />Now if you type mystring.[TAB] you will get an impressive list of string processing functions.<br /><br />A simple example to show the power of Python is a text processing program.<br /><a href="http://www.imdb.com/">IMDB</a> (The Internet Movie Database) has a big database of movies and related information. You can <a href="http://www.imdb.com/interfaces">download the database</a> as text files, which are not really easy to process. There is a text file for each type of movie-related information. To do some statistics and complex queries I had to load the data into memory in a searchable form. 
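<br /><br />A minimal sketch of what "searchable" can mean here, assuming a dictionary keyed by movie title whose values are small dictionaries of attributes. The two titles and their years are made up for illustration, and <b>movies_from</b> is a hypothetical helper, not part of the program:<br />

```python
# Hypothetical sample data: title -> attributes (not real IMDB entries).
movies = {
    'Example Movie (2009)': {'year': '2009', 'genre': []},
    'Another Example (2010)': {'year': '2010', 'genre': []},
}

def movies_from(db, year):
    """Return the sorted titles whose 'year' attribute equals the given string."""
    return sorted(title for title, info in db.items() if info.get('year') == year)

# Direct lookup by key, then a scan over all entries.
print(movies['Example Movie (2009)']['year'])
print(movies_from(movies, '2010'))
```

A lookup by title is a single dictionary access, while a query on any other attribute has to scan all the entries.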
<br /><br />Here is the program:<br /><h5><br />import os<br />import string<br />import time<br />import sys<br /><br /># Keep an already loaded database if this code is run again in the same shell<br />try:<br />    if len(Movies) == 0:<br />        Movies = {}<br />except NameError:<br />    Movies = {}<br /><br />def LoadMovies():<br />    global Movies<br />    filesize = os.path.getsize('movies.list')<br />    f=open('movies.list','rt')<br />    # file positions that correspond to 10%, 20%, ... 100%<br />    progress = [x * filesize / 100 for x in range(10,110,10)]<br />    start = False<br />    count = 0<br />    progressPos = 0<br />    lineNr = 0<br />    startTime = time.clock()<br />    print "0% ",<br />    for line in f:<br />        if lineNr%100 == 0 and f.tell() > progress[progressPos]:<br />            print str((progressPos + 1) * 10)+"% ",<br />            sys.stdout.flush()<br />            progressPos = progressPos + 1<br />        if start:<br />            ls = line.split('\t')<br />            if len(ls) > 1:<br />                moviename = unicode(string.strip(ls[0]),'latin_1')<br />                movieyear = string.strip(ls[-1])<br />                Movies[moviename]={'year':movieyear, 'genre':[]}<br />                count = count + 1<br />                if count == -1:<br />                    return<br />        else:<br />            # find() returns -1 (which is true) when the text is missing, so test with "in"<br />            if 'MOVIES LIST' in line:<br />                start = True<br />        lineNr = lineNr + 1<br />    f.close()<br />    print "100%\nLoaded",count,"entries."<br />    print "Done in ",time.clock() - startTime,"seconds."</h5><br />Now a quick description:<br />The file is opened for reading with the open function. The "movies.list" file contains movie titles and release years.<br />First I read the file size to display the progress while reading the file. Almost 20% of the code above is this progress indicator, because I had to optimize for speed. In order not to read the position and calculate the percentage every time, I pre-calculated the position in the file for every 10 percent. 
Then for every 100th line I get the position in the file and compare it with the "progressPos"th value in the table. The line where I calculate these values may look strange, but in Python this is called a <b>"list comprehension"</b>: an expression followed by a "for" clause and then zero or more additional "for" or "if" clauses.<br /><br />Example: [2**x for x in range(0,8)] calculates the powers of two for the exponents 0 to 7, and the result is [1, 2, 4, 8, 16, 32, 64, 128].<br /><br />The <b>"for"</b> statement can be used for reading a file line by line too.<br />Because the text files from IMDB contain some other text and details, I had to skip the lines before the actual list begins, for which I used the start variable.<br />Then I split every line at the TAB characters and take the last field with the index -1, because there can be more than one TAB between the movie title and the movie year. This is another nice feature of Python list indexing: in other languages you have to use size-1 to get the last position, here you can use negative indexes.<br />I'm storing the title and year in the dictionary variable "Movies". In other languages dictionaries are sometimes called "associative memories" or "associative arrays"; the C++ counterpart is the "map" and in C# it is the Hashtable. The key is the movie title and the value is another dictionary with the key "year" and the release year as its value. This is because I will store other information there later.<br /><br />Now, if you put this little program in a text file called imdb.py and save it in the same directory where you downloaded and unpacked the movies.list file, you can run the program either by directly executing the ".py" file, if you registered this extension to Python, or by starting IPython and changing the current directory to the one where the files reside with the "cd" command. 
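<br /><br />Before running the full program, the three language features described above (list comprehension, negative indexing and nested dictionaries) can be tried on their own. This is only a sketch: the sample line and the names <b>parse_line</b> and <b>db</b> are made up for illustration and are not part of imdb.py.<br />

```python
# List comprehension: powers of two for the exponents 0..7.
powers = [2 ** x for x in range(0, 8)]
print(powers)

def parse_line(line):
    """Split a tab-separated line into (title, year); None if there is no TAB."""
    fields = line.split('\t')
    if len(fields) < 2:
        return None
    # fields[-1] is the last field, however many TABs separate it from the title.
    return fields[0].strip(), fields[-1].strip()

# Made-up sample line with several TABs between title and year.
sample = 'Example Movie (2009)\t\t\t2009\n'
title, year = parse_line(sample)

# Nested dictionary: title -> {'year': ..., 'genre': [...]}.
db = {}
db[title] = {'year': year, 'genre': []}
print(db[title]['year'])
```

The sketch avoids Python 2 only constructs, so it should run unchanged under Python 2 and Python 3.<br /><br />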
An example session is shown below:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHK3B8MLGujbmtxSCM_tKZckEoONsPmAyc54yb7hH65F61q3VsiFyGglMPeE2_YJqrxHqB8mRA1flCCqSqHXKdVcRI7K3abnYyHFzHap7KDiSHwXKirumRJdbrbjANeKun-mteBHpTPoB0/s1600/python_imdb.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 202px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHK3B8MLGujbmtxSCM_tKZckEoONsPmAyc54yb7hH65F61q3VsiFyGglMPeE2_YJqrxHqB8mRA1flCCqSqHXKdVcRI7K3abnYyHFzHap7KDiSHwXKirumRJdbrbjANeKun-mteBHpTPoB0/s400/python_imdb.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5474880050462643618" /></a><br /><br />Now some sample queries:<br />- get all the movies for the year 2010: y2010 = [k for k,v in imdb.Movies.iteritems() if v['year']=='2010']<br />- the number of movies: len(y2010)<br />- the first and last movie: y2010[0], y2010[-1]<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXckxHFvWu8g2OsNXZT_9s7LyhvgEgoJNkvbrynXN1Rrc6zvmpvIStu_EDTZ0PKnWxg90LNwe_LjWOVGanO2VaRzPoUCJDJ-QGsqp-Nt0SA75-4pQd7XnccPgXeSZy2w-bxqqCrRJ4qIS7/s1600/movies_2010.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 202px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXckxHFvWu8g2OsNXZT_9s7LyhvgEgoJNkvbrynXN1Rrc6zvmpvIStu_EDTZ0PKnWxg90LNwe_LjWOVGanO2VaRzPoUCJDJ-QGsqp-Nt0SA75-4pQd7XnccPgXeSZy2w-bxqqCrRJ4qIS7/s400/movies_2010.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5474892008011688706" /></a><br /><br />To search by movie title, for example for movies with "Star Trek" in the title (find returns 0 when the title starts with the text, so the test is >= 0):<br />star_trek = [k for k,v in imdb.Movies.iteritems() if k.find('Star Trek') >= 0]<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEip5LBT8EoEdUKHp3We7ImHlCYgonzpMfIG7Ol3WLP8dDNUwTjysD_EO3jdtiaZUzRWW_m0CafJsqx-9pgzkRkCsXYJJ5ckfrbPe1DNlw7sn20a102RpploXUE6roXmu5MqBbpS-TJqNw_V/s1600/star_trek.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 202px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEip5LBT8EoEdUKHp3We7ImHlCYgonzpMfIG7Ol3WLP8dDNUwTjysD_EO3jdtiaZUzRWW_m0CafJsqx-9pgzkRkCsXYJJ5ckfrbPe1DNlw7sn20a102RpploXUE6roXmu5MqBbpS-TJqNw_V/s400/star_trek.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5474896425179032066" /></a><br /><br />You will need at least 1 GB of RAM because the movie database contains 1.6 million items.<br /><br />In the next post I will present the matplotlib library for Python and will make some nice graphs about movies by year, country, language, etc.<br /><br />To be continued ...<br />Lazar Laszlo<br /><br />Cheat sheets<br />Published 2010-05-18T01:46:00.000-07:00, updated 2010-05-18T02:06:41.540-07:00<br /><br />Not the ones used by students without the instructor's knowledge to cheat on a test.<br /><br />These are simple pages to help you in your work by providing quick references for programming languages, tools, web technologies, command line options etc.<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7JWrVtLMWkUTso6BN4ypXfyj6mZLvCwbGpytlZ6EIfhJUiSqwy3TORBBwAsvbQvrSac2SrRmxeJW2djghpYWwUO_CNK5dztbPF0Ncr6Bq25vvYtKYI2Kh8U0ZDvbC9UHgXkTbC3QBqXXe/s1600/earth.png"><img style="float: right; margin: 0pt 0pt 10px 10px; cursor: pointer; width: 144px; height: 149px;" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7JWrVtLMWkUTso6BN4ypXfyj6mZLvCwbGpytlZ6EIfhJUiSqwy3TORBBwAsvbQvrSac2SrRmxeJW2djghpYWwUO_CNK5dztbPF0Ncr6Bq25vvYtKYI2Kh8U0ZDvbC9UHgXkTbC3QBqXXe/s320/earth.png" alt="" id="BLOGGER_PHOTO_ID_5472531167800431682" border="0" /></a><br />You can print them, but to be environmentally friendly just save them locally or open them directly from the web.<br /><br /><br /><br /><br />Or use an iPad:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdjYMmdV7VZwWUl8IcoJlFXRsnWzz5kTJ7M0YYQyq2uvrUHwzXFkZxncCKnYrsqSUBwMeMHyM6WH4voJqJK6AYKmUwm_h40bEJai8LQ9_NuFan7P56jPd2gUw3ZdhdSDHmreYa935xJsIO/s1600/ipad_cheatsheet.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 251px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjdjYMmdV7VZwWUl8IcoJlFXRsnWzz5kTJ7M0YYQyq2uvrUHwzXFkZxncCKnYrsqSUBwMeMHyM6WH4voJqJK6AYKmUwm_h40bEJai8LQ9_NuFan7P56jPd2gUw3ZdhdSDHmreYa935xJsIO/s320/ipad_cheatsheet.jpg" alt="" id="BLOGGER_PHOTO_ID_5472533161981589586" border="0" /></a><br /><br />Now the links:<br /><a href="http://www.cheat-sheets.org/">www.cheat-sheets.org/</a><br /><a href="http://www.addedbytes.com/cheat-sheets/">www.addedbytes.com/cheat-sheets/</a><br /><a href="http://packetlife.net/library/cheat-sheets/">packetlife.net/library/cheat-sheets/</a><br />Lazar Laszlo<br /><br />PYTHON<br />Published 2010-05-17T08:28:00.000-07:00, updated 2010-05-18T02:40:15.375-07:00<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbuy-WI5se0dPyEkxyAWZQ6iGcWLqveZGFb1D__111SaHmv3vVeIsAb50XBtDVrrvy9LO4vninoY6_w2NdXg4Ai3D0kluzo3nP9wsN9KIHbEho1F-TNsI4g5Txf3Gnkf9d_R1LVJgK3jpN/s1600/2dkzXin.jpg"><img 
style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 300px; height: 225px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbuy-WI5se0dPyEkxyAWZQ6iGcWLqveZGFb1D__111SaHmv3vVeIsAb50XBtDVrrvy9LO4vninoY6_w2NdXg4Ai3D0kluzo3nP9wsN9KIHbEho1F-TNsI4g5Txf3Gnkf9d_R1LVJgK3jpN/s320/2dkzXin.jpg" alt="" id="BLOGGER_PHOTO_ID_5472263059187733874" border="0" /></a><br /><h6>The Pythonidae, commonly known simply as pythons, from the Greek word python-πυθων, are a family of non-venomous snakes found in Africa, Asia and Australia. Among its members are some of the largest snakes in the world. Eight genera and 26 species are currently recognized. (long live Wikipedia)</h6><br /><br />Nice ...<br /><br />Well, this is not about that beautiful snake. It's about the Python programming language, my new passion. Besides the other languages I currently use at work or at home, I'm starting to love this simple yet powerful <b>interpreted</b> language.<br /><br /><a href="http://www.python.org/">You can find it here together with more information.</a><br /><br />What I like about it is the portability (there are implementations for Windows, Linux/Unix, Mac OS X, Symbian [mobile phones]) and the speed of developing small applications. No need for big development environments, for compiling etc. Just RUN. <br /><br />If you are too lazy to write it into a text file you can just type it into the command interpreter. That is what the bold word <b>interpreted</b> above refers to.<br /><br />I don't want to write about the language itself - you can find nice tutorials and Hello World apps on the net. I want to write about some modules I found useful.<br /><br />The easiest way to install modules is using the <a href="http://pypi.python.org/pypi/setuptools">setuptools </a> utility. 
You should download the binary package for Windows, or the sources for Linux if your distribution doesn't already include it.<br />Then you will have the easy_install.exe program in the Pythonxy\Scripts directory (xy stands for the Python version number).<br /><br />First, to enhance productivity, there is the <a href="http://ipython.scipy.org/">IPython</a> interactive computing environment.<br /><br />To install: first run <b>easy_install pyreadline</b>, then <b>easy_install ipython</b><br /><br />Or you can get the binary distribution from the Download page, but you will need the <a href="http://ipython.scipy.org/moin/PyReadline/Intro">pyreadline</a> module too. This nice package will give you some help in the command line interface, like tab completion (Linux/Unix style), history, colors, etc.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiaCvaYw8LMaMyB_kgbIap0iQiBpg46TXkHOBhH3ae169eK8lRRUbQa3oBA96-5VLODE1nJHh66fHf2crFXS8bL0CXO2u1eXzld9jqv70tRGX5N6rm6rlZ-p1eLbWJuE9AuDPHU3JxodMO/s1600/ipython.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 180px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiaCvaYw8LMaMyB_kgbIap0iQiBpg46TXkHOBhH3ae169eK8lRRUbQa3oBA96-5VLODE1nJHh66fHf2crFXS8bL0CXO2u1eXzld9jqv70tRGX5N6rm6rlZ-p1eLbWJuE9AuDPHU3JxodMO/s320/ipython.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5472274533780419874" /></a><br /><br />To be continued ...<br />Lazar Laszlo<br /><br />HTML5 - presentations, demos<br />Published 2010-05-17T01:57:00.000-07:00, updated 2010-05-17T10:54:04.593-07:00<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSTJGtc7xfcjT_z_WvTYcW0IovZFRBmy8FaHJgMqmYrhwbIS0HrQjT9LDLlfC2Bcw5Mz79M6mh-RUOBtlSsty0nY-4urY1PHoVkpMaqJt2r0EIHEGZWEeQgUv0w9FrbPY91hns5XSDkkV5/s1600/html5.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 247px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSTJGtc7xfcjT_z_WvTYcW0IovZFRBmy8FaHJgMqmYrhwbIS0HrQjT9LDLlfC2Bcw5Mz79M6mh-RUOBtlSsty0nY-4urY1PHoVkpMaqJt2r0EIHEGZWEeQgUv0w9FrbPY91hns5XSDkkV5/s320/html5.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5472295757055292434" /></a><br /><br /><br />The future of the Web or the Web of the future, hard to decide.<br /><br /><a href="http://apirocks.com/html5/html5.html#slide1">A nice presentation of the new HTML standard's capabilities.</a><br />To view the presentation an HTML5-capable web browser is needed, such as: <a href="http://www.mozilla.com/firefox/">Mozilla Firefox</a>, <a href="http://www.google.com/chrome">Google Chrome</a>, <a href="http://www.opera.com">Opera</a>, <a href="http://www.apple.com/safari/">Safari</a><br /><br /><a href="http://www.benjoffe.com/code/">Some nice applications of the HTML5 canvas element by Ben Joffe</a><br /><br /><a href="http://www.canvasdemos.com/">Applications, games, tools and tutorials for the HTML5 canvas element</a><br /><br /><a href="http://html5test.com">Test your browser's HTML5 support</a><br /><br /><a href="http://dev.w3.org/html5/spec/Overview.html">The HTML5 standard</a><br />Lazar Laszlo