In my ideas for TED I mentioned that they currently don't give new users access to their API. That should not necessarily stop us from getting the data from the web site. We can use good old web scraping.

I've picked one of the videos almost randomly: The year open data went worldwide. If you look at the page you'll see that it has "32 subtitle languages" (or maybe more by the time you look at it). If you click on that text you'll see a modal display showing the list of the languages. Clearly it is some JavaScript code that generates this modal display.

I looked at the source of the page (just by right-clicking on the page in the browser) trying to locate the data. At first I searched for "subtitle languages", but that did not lead me to the list of languages.

Then I searched for 'Chinese', the name of one of the translations, which I suspected would not show up in any other part of the page. I found it in a JSON structure inside a JavaScript function call embedded in a <script> tag.

Equipped with this information, I started to write a small script that would fetch the page, find all the 'script' tags, and print the content of these tags. At first I used Web::Query to fetch the page, find the 'script' tags and extract their content. The first two steps went well, but the function I expected to return the 'text' inside the 'script' tags did not return anything. So I filed a bug-report/question.

I did not want to wait for a reply, so for a faster solution I turned to regular expressions. Normally parsing HTML using regexes is considered a sin, but in this case we only have to extract a single tag that does not have any other tags in it and is very unlikely to contain a string that looks like a closing 'script' tag.

examples/scraping_ted/parse_with_regex.pl

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

use LWP::Simple qw(get);

my $url = 'http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide';

my $html = get $url;

foreach my $script ($html =~ m{<script>(.*?)</script>}gs) {
    say $script;
    say '------';
}

I used the get function of LWP::Simple to fetch the page and then a regex to extract the content of every 'script' tag. In this regex I've used .*? for a minimal match, and the s modifier to make . match any character, including newlines. The g modifier makes the match global, so in list context it returns all the matches. Note that this pattern only matches plain <script> tags that have no attributes.
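To see what the modifiers and the minimal match do without hitting the network, here is a small sketch that runs the same regex on a tiny page. The HTML in it is made up for the demonstration and is not taken from the TED site.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical sample page; the real TED page is far larger.
my $html = <<'HTML';
<html><head>
<script src="x.js"></script>
<script>var a = 1;</script>
<script>
q("demo", {"x": 1})
</script>
</head><body><p>text</p></body></html>
HTML

# /s lets . match newlines, /g returns every match in list context,
# and .*? stops at the first closing tag instead of the last one.
my @lazy = $html =~ m{<script>(.*?)</script>}gs;

# The greedy version runs from the first plain <script> to the last </script>.
my @greedy = $html =~ m{<script>(.*)</script>}gs;

say scalar(@lazy);      # 2 (the tag with the src attribute is not matched)
say scalar(@greedy);    # 1
say $lazy[0];           # var a = 1;
```

The greedy version swallows both tags and everything between them in a single match, which is why the question mark matters here.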

Extract JSON

The next step was to extract the JSON from within the 'script' tag. For this I had to use a regex again, as the JSON is embedded in a call to a JavaScript function called 'q'. It looked something like this:

q("talkPage.init",{"talks" ... })

except that there was a lot more code instead of the 3 dots.

For this I used the following expression:

    my ($json) = $script =~ /^q\("talkPage\.init",(\{"talks".*)\)/s;

The left-hand side of the assignment must be in parentheses to create LIST context, and we again use the s modifier to make . match newlines as well.

There are many 'script' tags on this page, but only one is expected to match this regex and that one is expected to return a JSON string. So we can skip the rest of the loop if $json is empty.
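The difference between LIST and SCALAR context is easy to miss, so here is a minimal sketch of it. The content of $script is a hypothetical, heavily shortened stand-in for what one 'script' tag might hold. The regex is the same one as above, only stored in a variable.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical, heavily shortened stand-in for the content of one script tag.
my $script = 'q("talkPage.init",{"talks":[{"id":788}]})';

my $re = qr/^q\("talkPage\.init",(\{"talks".*)\)/s;

# LIST context: the match returns the captured text.
my ($json) = $script =~ $re;

# SCALAR context: the match only returns whether it succeeded.
my $matched = $script =~ $re;

say $json;       # {"talks":[{"id":788}]}
say $matched;    # 1

# A script tag with different content returns an empty list, so the variable stays undef.
my ($none) = 'var a = 1;' =~ $re;
say defined $none ? 'found' : 'skipped';    # skipped
```

Without the parentheses we would get 1 instead of the JSON string, and decoding that would not give us the data we are after.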

If $json is not empty, we would like to convert it to a Perl data structure. For that we can use the decode_json function provided by any of the JSON modules. This results in the following code:

examples/scraping_ted/extract_json.pl

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

use LWP::Simple qw(get);
use JSON qw(decode_json);
use Data::Dumper qw(Dumper);

my $url = 'http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide';

my $html = get $url;

foreach my $script ($html =~ m{<script>(.*?)</script>}gs) {
    my ($json) = $script =~ /^q\("talkPage\.init",(\{"talks".*)\)/s;
    next if not $json;
    my $data = decode_json $json;
    print Dumper $data;
    say '------';
}

Unfortunately, running this script throws an exception:

Wide character in subroutine entry at extract_json.pl line 18

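I've already seen this problem once. The decode_json function expects a string of UTF-8 encoded bytes, but the string returned by get has already been decoded to characters. So I had to convert the JSON string back to UTF-8 bytes using the encode function of the Encode module.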
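The problem can be reproduced in isolation. This is only a sketch under a few assumptions: the JSON string is hand-written, and I use the core JSON::PP module instead of JSON so that it runs without installing anything. Its decode_json is called the same way.

```perl
use strict;
use warnings;
use 5.010;
use JSON::PP qw(decode_json);
use Encode qw(encode);

# A string of characters (not bytes), similar to what get returns for the page.
# \x{4e2d}\x{6587} are the two characters of the Chinese endonym.
my $json = qq({"endonym":"\x{4e2d}\x{6587}"});

# decode_json expects UTF-8 encoded bytes, so characters above 255 make it die.
my $ok = eval { decode_json($json); 1 };
say $ok ? 'decoded' : "failed: $@";

# Encoding the characters to UTF-8 bytes first makes it work.
my $data = decode_json(encode('UTF-8', $json));
say length $data->{endonym};    # 2
```

The first call fails with the same "Wide character" complaint, while the second one returns a structure in which the endonym is a 2-character string again.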

examples/scraping_ted/get_and_extract_json.pl

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

use LWP::Simple qw(get);
use JSON qw(decode_json);
use Data::Dumper qw(Dumper);
use Encode qw(encode);

my $url = 'http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide';

my $html = get $url;

foreach my $script ($html =~ m{<script>(.*?)</script>}gs) {
    my ($json) = $script =~ /^q\("talkPage\.init",(\{"talks".*)\)/s;
    next if not $json;
    my $data = decode_json encode('utf8', $json);
    print Dumper $data;
}
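Once the JSON is decoded, the list of languages we were looking for is just a nested array inside the data. As a rough sketch, assuming a hand-written sample that has the same shape as the real data but only two languages, it could be walked like this. I again use the core JSON::PP module here so the snippet runs on its own.

```perl
use strict;
use warnings;
use 5.010;
use JSON::PP qw(decode_json);

# Hand-written sample with the same shape as the real data, but only two languages.
my $json = <<'JSON';
{"talks":[{"title":"The year open data went worldwide",
  "languages":[
    {"languageCode":"sq","languageName":"Albanian","isRtl":false},
    {"languageCode":"ar","languageName":"Arabic","isRtl":true}
  ]}]}
JSON

my $data = decode_json($json);

# talks is an array of hashes; each talk holds an array of language hashes.
my @languages = @{ $data->{talks}[0]{languages} };
say scalar(@languages), ' subtitle languages';    # 2 subtitle languages

for my $lang (@languages) {
    # The JSON true/false values behave as booleans in Perl.
    my $dir = $lang->{isRtl} ? 'right-to-left' : 'left-to-right';
    say "$lang->{languageCode}\t$lang->{languageName}\t$dir";
}
```

That was only a toy structure. The real page yields a lot more than the languages.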

The output of the get_and_extract_json.pl script looks like this:

examples/scraping_ted/json_dump.pl

$VAR1 = {
          'threadId' => 649,
          'relatedTalks' => [
                              {
                                'title' => 'The next web',
                                'slug' => 'tim_berners_lee_on_the_next_web',
                                'duration' => 983,
                                'speaker' => 'Tim Berners-Lee',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/77260_800x600.jpg?quality=75&w=500'
                              },
                              {
                                'slug' => 'melati_and_isabel_wijsen_our_campaign_to_ban_plastic_bags_in_bali',
                                'title' => 'Our campaign to ban plastic bags in Bali',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/0da6ace6197fc74eaf425c413eb5636d57e9891e_2880x1620.jpg?quality=75&w=500',
                                'duration' => 660,
                                'speaker' => 'Melati and Isabel Wijsen'
                              },
                              {
                                'slug' => 'auke_ijspeert_a_robot_that_runs_and_swims_like_a_salamander',
                                'title' => 'A robot that runs and swims like a salamander',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/0b0fb52e085bad1e7834a6bfcc93f27cba088559_2880x1620.jpg?quality=75&w=500',
                                'duration' => 850,
                                'speaker' => 'Auke Ijspeert'
                              },
                              {
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/017abc101c829da234618637fdfbfd09eb296fba_2880x1620.jpg?quality=75&w=500',
                                'duration' => 1053,
                                'speaker' => 'Elizabeth Lev',
                                'slug' => 'elizabeth_lev_the_unheard_story_of_the_sistine_chapel',
                                'title' => 'The unheard story of the Sistine Chapel'
                              },
                              {
                                'duration' => 656,
                                'speaker' => 'Oscar Schwartz',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/952396f5b0b7aa178f6198669818c9b1cf324312_2880x1620.jpg?quality=75&w=500',
                                'title' => 'Can a computer write poetry?',
                                'slug' => 'oscar_schwartz_can_a_computer_write_poetry'
                              },
                              {
                                'slug' => 'wael_ghonim_let_s_design_social_media_that_drives_real_change',
                                'title' => 'Let\'s design social media that drives real change',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/0da87f5c54fb8855274c9553595d550999e71288_2880x1620.jpg?quality=75&w=500',
                                'duration' => 814,
                                'speaker' => 'Wael Ghonim'
                              },
                              {
                                'slug' => 'aomawa_shields_how_we_ll_find_life_on_other_planets',
                                'title' => 'How we\'ll find life on other planets',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/b71083deab779a49aa52070dabd282cf96296b38_2880x1620.jpg?quality=75&w=500',
                                'speaker' => 'Aomawa Shields',
                                'duration' => 325
                              },
                              {
                                'title' => 'Governments don\'t understand cyber warfare. We need hackers',
                                'slug' => 'rodrigo_bijou_governments_don_t_understand_cyber_warfare_we_need_hackers',
                                'duration' => 568,
                                'speaker' => 'Rodrigo Bijou',
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/28a3233882ad006a57da361770cec0cbaeab5170_2880x1620.jpg?quality=75&w=500'
                              },
                              {
                                'title' => 'The future of news? Virtual reality',
                                'slug' => 'nonny_de_la_pena_the_future_of_news_virtual_reality',
                                'duration' => 567,
                                'speaker' => "Nonny de la Pe\x{f1}a",
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/ddaf3e1ce01e2c3ee875970d3c7bf8bb4e9e92c3_2880x1620.jpg?quality=75&w=500'
                              },
                              {
                                'image' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/90eeddc216ca86ad2fbf99d0823a39fe681e7513_2880x1620.jpg?quality=75&w=500',
                                'speaker' => "Andreas Ekstr\x{f6}m",
                                'duration' => 558,
                                'slug' => 'andreas_ekstrom_the_moral_bias_behind_your_search_results',
                                'title' => 'The moral bias behind your search results'
                              }
                            ],
          'precontrol' => bless( do{\(my $o = 0)}, 'JSON::PP::Boolean' ),
          'ratings' => [
                         {
                           'id' => 10,
                           'name' => 'Inspiring',
                           'count' => 259
                         },
                         {
                           'count' => 205,
                           'name' => 'Informative',
                           'id' => 8
                         },
                         {
                           'name' => 'Fascinating',
                           'count' => 164,
                           'id' => 22
                         },
                         {
                           'count' => 136,
                           'name' => 'Persuasive',
                           'id' => 24
                         },
                         {
                           'id' => 3,
                           'name' => 'Courageous',
                           'count' => 17
                         },
                         {
                           'count' => 85,
                           'name' => 'Ingenious',
                           'id' => 9
                         },
                         {
                           'count' => 88,
                           'name' => 'Jaw-dropping',
                           'id' => 23
                         },
                         {
                           'count' => 31,
                           'name' => 'Beautiful',
                           'id' => 1
                         },
                         {
                           'id' => 2,
                           'count' => 12,
                           'name' => 'Confusing'
                         },
                         {
                           'count' => 24,
                           'name' => 'OK',
                           'id' => 25
                         },
                         {
                           'id' => 26,
                           'count' => 11,
                           'name' => 'Obnoxious'
                         },
                         {
                           'id' => 21,
                           'name' => 'Unconvincing',
                           'count' => 8
                         },
                         {
                           'count' => 6,
                           'name' => 'Longwinded',
                           'id' => 11
                         },
                         {
                           'id' => 7,
                           'count' => 4,
                           'name' => 'Funny'
                         }
                       ],
          'language' => 'en',
          'talks' => [
                       {
                         'external' => undef,
                         'slug' => 'tim_berners_lee_the_year_open_data_went_worldwide',
                         'published' => 1268040420,
                         'nativeLanguage' => 'en',
                         'thumb' => 'https://tedcdnpi-a.akamaihd.net/r/tedcdnpe-a.akamaihd.net/images/ted/154673_800x600.jpg?quality=89&w=600',
                         'duration' => 333,
                         'resources' => {
                                          'rtmp' => [
                                                      {
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-1500k.mp4',
                                                        'height' => 720,
                                                        'width' => 1280,
                                                        'name' => '1500k',
                                                        'bitrate' => 1500
                                                      },
                                                      {
                                                        'bitrate' => 950,
                                                        'name' => '950k',
                                                        'width' => 854,
                                                        'height' => 480,
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-950k.mp4'
                                                      },
                                                      {
                                                        'bitrate' => 600,
                                                        'name' => '600k',
                                                        'width' => 640,
                                                        'height' => 360,
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-600k.mp4'
                                                      },
                                                      {
                                                        'name' => '450k',
                                                        'bitrate' => 450,
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-450k.mp4',
                                                        'height' => 288,
                                                        'width' => 512
                                                      },
                                                      {
                                                        'bitrate' => 320,
                                                        'name' => '320k',
                                                        'width' => 512,
                                                        'height' => 288,
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-320k.mp4'
                                                      },
                                                      {
                                                        'name' => '180k',
                                                        'bitrate' => 180,
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-180k.mp4',
                                                        'height' => 288,
                                                        'width' => 512
                                                      },
                                                      {
                                                        'bitrate' => 64,
                                                        'name' => '64k',
                                                        'width' => 398,
                                                        'height' => 224,
                                                        'file' => 'mp4:talk/stream/2010U/Blank/TimBernersLee_2010U-64k.mp4'
                                                      }
                                                    ],
                                          'hls' => {
                                                     'metadata' => 'https://hls.ted.com/talks/788.json',
                                                     'adUrl' => 'https://pubads.g.doubleclick.net/gampad/ads?ciu_szs=300x250%2C512x288%2C120x60%2C320x50%2C6x7%2C6x8&correlator=%5Bcorrelator%5D&cust_params=event%3DTED2010%26id%3D788%26tag%3DInternet%2CTED%2BConference%2Ccomputers%2Cstatistics%2Cvisualizations%2Cweb%26talk%3Dtim_berners_lee_the_year_open_data_went_worldwide%26year%3D2010&env=vp&gdfp_req=1&impl=s&iu=%2F5641%2Fmobile%2Fios%2Fweb&output=xml_vast2&sz=640x360&unviewed_position_start=1&url=%5Breferrer%5D',
                                                     'stream' => 'https://hls.ted.com/talks/788.m3u8'
                                                   },
                                          'h264' => [
                                                      {
                                                        'bitrate' => 320,
                                                        'file' => 'http://download.ted.com/talks/TimBernersLee_2010U-320k.mp4?dnt'
                                                      }
                                                    ]
                                        },
                         'title' => 'The year open data went worldwide',
                         'languages' => [
                                          {
                                            'languageCode' => 'sq',
                                            'ianaCode' => 'sq',
                                            'languageName' => 'Albanian',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => 'Shqip'
                                          },
                                          {
                                            'languageCode' => 'ar',
                                            'ianaCode' => 'ar',
                                            'endonym' => "\x{627}\x{644}\x{639}\x{631}\x{628}\x{64a}\x{629}",
                                            'isRtl' => bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ),
                                            'languageName' => 'Arabic'
                                          },
                                          {
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "\x{431}\x{44a}\x{43b}\x{433}\x{430}\x{440}\x{441}\x{43a}\x{438}",
                                            'languageName' => 'Bulgarian',
                                            'ianaCode' => 'bg',
                                            'languageCode' => 'bg'
                                          },
                                          {
                                            'languageCode' => 'zh-cn',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "\x{4e2d}\x{6587} (\x{7b80}\x{4f53})",
                                            'languageName' => 'Chinese, Simplified',
                                            'ianaCode' => 'zh-Hans'
                                          },
                                          {
                                            'languageCode' => 'zh-tw',
                                            'ianaCode' => 'zh-Hant',
                                            'endonym' => "\x{4e2d}\x{6587} (\x{7e41}\x{9ad4})",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Chinese, Traditional'
                                          },
                                          {
                                            'languageCode' => 'hr',
                                            'ianaCode' => 'hr',
                                            'endonym' => 'Hrvatski',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Croatian'
                                          },
                                          {
                                            'languageCode' => 'cs',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Czech',
                                            'endonym' => "\x{10c}e\x{161}tina",
                                            'ianaCode' => 'cs'
                                          },
                                          {
                                            'languageCode' => 'nl',
                                            'endonym' => 'Nederlands',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Dutch',
                                            'ianaCode' => 'nl'
                                          },
                                          {
                                            'languageCode' => 'en',
                                            'ianaCode' => 'en',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => 'English',
                                            'languageName' => 'English'
                                          },
                                          {
                                            'endonym' => "Fran\x{e7}ais",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'French',
                                            'ianaCode' => 'fr',
                                            'languageCode' => 'fr'
                                          },
                                          {
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'German',
                                            'endonym' => 'Deutsch',
                                            'ianaCode' => 'de',
                                            'languageCode' => 'de'
                                          },
                                          {
                                            'languageCode' => 'el',
                                            'ianaCode' => 'el',
                                            'endonym' => "\x{395}\x{3bb}\x{3bb}\x{3b7}\x{3bd}\x{3b9}\x{3ba}\x{3ac}",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Greek'
                                          },
                                          {
                                            'ianaCode' => 'he',
                                            'endonym' => "\x{5e2}\x{5d1}\x{5e8}\x{5d9}\x{5ea}",
                                            'isRtl' => $VAR1->{'talks'}[0]{'languages'}[1]{'isRtl'},
                                            'languageName' => 'Hebrew',
                                            'languageCode' => 'he'
                                          },
                                          {
                                            'endonym' => 'Magyar',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Hungarian',
                                            'ianaCode' => 'hu',
                                            'languageCode' => 'hu'
                                          },
                                          {
                                            'languageCode' => 'id',
                                            'languageName' => 'Indonesian',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => 'Bahasa Indonesia',
                                            'ianaCode' => 'id'
                                          },
                                          {
                                            'languageCode' => 'it',
                                            'ianaCode' => 'it',
                                            'endonym' => 'Italiano',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Italian'
                                          },
                                          {
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "\x{65e5}\x{672c}\x{8a9e}",
                                            'languageName' => 'Japanese',
                                            'ianaCode' => 'ja',
                                            'languageCode' => 'ja'
                                          },
                                          {
                                            'languageCode' => 'ko',
                                            'ianaCode' => 'ko',
                                            'endonym' => "\x{d55c}\x{ad6d}\x{c5b4}",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Korean'
                                          },
                                          {
                                            'languageCode' => 'lv',
                                            'ianaCode' => 'lv',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "Latvie\x{161}u",
                                            'languageName' => 'Latvian'
                                          },
                                          {
                                            'endonym' => "Lietuvi\x{173} kalba",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Lithuanian',
                                            'ianaCode' => 'lt',
                                            'languageCode' => 'lt'
                                          },
                                          {
                                            'ianaCode' => 'fa',
                                            'languageName' => 'Persian',
                                            'isRtl' => $VAR1->{'talks'}[0]{'languages'}[1]{'isRtl'},
                                            'endonym' => "\x{641}\x{627}\x{631}\x{633}\x{649}",
                                            'languageCode' => 'fa'
                                          },
                                          {
                                            'languageCode' => 'pl',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => 'Polski',
                                            'languageName' => 'Polish',
                                            'ianaCode' => 'pl'
                                          },
                                          {
                                            'languageCode' => 'pt',
                                            'ianaCode' => 'pt',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Portuguese',
                                            'endonym' => "Portugu\x{ea}s de Portugal"
                                          },
                                          {
                                            'endonym' => "Portugu\x{ea}s brasileiro",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Portuguese, Brazilian',
                                            'ianaCode' => 'pt-BR',
                                            'languageCode' => 'pt-br'
                                          },
                                          {
                                            'languageCode' => 'ro',
                                            'languageName' => 'Romanian',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "Rom\x{e2}n\x{103}",
                                            'ianaCode' => 'ro'
                                          },
                                          {
                                            'languageCode' => 'ru',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Russian',
                                            'endonym' => "\x{420}\x{443}\x{441}\x{441}\x{43a}\x{438}\x{439}",
                                            'ianaCode' => 'ru'
                                          },
                                          {
                                            'ianaCode' => 'sk',
                                            'endonym' => "Sloven\x{10d}ina",
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Slovak',
                                            'languageCode' => 'sk'
                                          },
                                          {
                                            'ianaCode' => 'es',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "Espa\x{f1}ol",
                                            'languageName' => 'Spanish',
                                            'languageCode' => 'es'
                                          },
                                          {
                                            'ianaCode' => 'sv',
                                            'endonym' => 'Svenska',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Swedish',
                                            'languageCode' => 'sv'
                                          },
                                          {
                                            'ianaCode' => 'tr',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'languageName' => 'Turkish',
                                            'endonym' => "T\x{fc}rk\x{e7}e",
                                            'languageCode' => 'tr'
                                          },
                                          {
                                            'ianaCode' => 'uk',
                                            'languageName' => 'Ukrainian',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "\x{423}\x{43a}\x{440}\x{430}\x{457}\x{43d}\x{441}\x{44c}\x{43a}\x{430}",
                                            'languageCode' => 'uk'
                                          },
                                          {
                                            'languageCode' => 'vi',
                                            'languageName' => 'Vietnamese',
                                            'isRtl' => $VAR1->{'precontrol'},
                                            'endonym' => "Ti\x{1ebf}ng Vi\x{1ec7}t",
                                            'ianaCode' => 'vi'
                                          }
                                        ],
                         'speaker' => 'Tim Berners-Lee',
                         'postAdDuration' => '0.83',
                         'filmed' => 1265798100,
                         'targeting' => {
                                          'tag' => 'Internet,TED Conference,computers,statistics,visualizations,web',
                                          'talk' => 'tim_berners_lee_the_year_open_data_went_worldwide',
                                          'id' => 788,
                                          'event' => 'TED2010',
                                          'year' => '2010'
                                        },
                         'adDuration' => '3.33',
                         'id' => 788,
                         'name' => 'Tim Berners-Lee: The year open data went worldwide',
                         'isSubtitleRequired' => $VAR1->{'precontrol'},
                         'nativeDownloads' => {
                                                'medium' => 'http://download.ted.com/talks/TimBernersLee_2010U.mp4?apikey=489b859150fc58263f17110eeb44ed5fba4a3b22',
                                                'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-light.mp4?apikey=489b859150fc58263f17110eeb44ed5fba4a3b22',
                                                'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p.mp4?apikey=489b859150fc58263f17110eeb44ed5fba4a3b22'
                                              },
                         'subtitledDownloads' => {
                                                   'tr' => {
                                                             'name' => 'Turkish',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-tr.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-tr.mp4'
                                                           },
                                                   'nl' => {
                                                             'name' => 'Dutch',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-nl.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-nl.mp4'
                                                           },
                                                   'lt' => {
                                                             'name' => 'Lithuanian',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-lt.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-lt.mp4'
                                                           },
                                                   'sv' => {
                                                             'name' => 'Swedish',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-sv.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-sv.mp4'
                                                           },
                                                   'zh-tw' => {
                                                                'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-zh-tw.mp4',
                                                                'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-zh-tw.mp4',
                                                                'name' => 'Chinese, Traditional'
                                                              },
                                                   'pt' => {
                                                             'name' => 'Portuguese',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-pt.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-pt.mp4'
                                                           },
                                                   'ja' => {
                                                             'name' => 'Japanese',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-ja.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-ja.mp4'
                                                           },
                                                   'ro' => {
                                                             'name' => 'Romanian',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-ro.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-ro.mp4'
                                                           },
                                                   'ko' => {
                                                             'name' => 'Korean',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-ko.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-ko.mp4'
                                                           },
                                                   'pt-br' => {
                                                                'name' => 'Portuguese, Brazilian',
                                                                'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-pt-br.mp4',
                                                                'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-pt-br.mp4'
                                                              },
                                                   'lv' => {
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-lv.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-lv.mp4',
                                                             'name' => 'Latvian'
                                                           },
                                                   'bg' => {
                                                             'name' => 'Bulgarian',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-bg.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-bg.mp4'
                                                           },
                                                   'ar' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-ar.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-ar.mp4',
                                                             'name' => 'Arabic'
                                                           },
                                                   'sk' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-sk.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-sk.mp4',
                                                             'name' => 'Slovak'
                                                           },
                                                   'he' => {
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-he.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-he.mp4',
                                                             'name' => 'Hebrew'
                                                           },
                                                   'cs' => {
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-cs.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-cs.mp4',
                                                             'name' => 'Czech'
                                                           },
                                                   'vi' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-vi.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-vi.mp4',
                                                             'name' => 'Vietnamese'
                                                           },
                                                   'el' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-el.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-el.mp4',
                                                             'name' => 'Greek'
                                                           },
                                                   'ru' => {
                                                             'name' => 'Russian',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-ru.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-ru.mp4'
                                                           },
                                                   'sq' => {
                                                             'name' => 'Albanian',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-sq.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-sq.mp4'
                                                           },
                                                   'fr' => {
                                                             'name' => 'French',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-fr.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-fr.mp4'
                                                           },
                                                   'uk' => {
                                                             'name' => 'Ukrainian',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-uk.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-uk.mp4'
                                                           },
                                                   'id' => {
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-id.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-id.mp4',
                                                             'name' => 'Indonesian'
                                                           },
                                                   'de' => {
                                                             'name' => 'German',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-de.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-de.mp4'
                                                           },
                                                   'zh-cn' => {
                                                                'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-zh-cn.mp4',
                                                                'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-zh-cn.mp4',
                                                                'name' => 'Chinese, Simplified'
                                                              },
                                                   'en' => {
                                                             'name' => 'English',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-en.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-en.mp4'
                                                           },
                                                   'es' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-es.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-es.mp4',
                                                             'name' => 'Spanish'
                                                           },
                                                   'pl' => {
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-pl.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-pl.mp4',
                                                             'name' => 'Polish'
                                                           },
                                                   'fa' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-fa.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-fa.mp4',
                                                             'name' => 'Persian'
                                                           },
                                                   'it' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-it.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-it.mp4',
                                                             'name' => 'Italian'
                                                           },
                                                   'hu' => {
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-hu.mp4',
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-hu.mp4',
                                                             'name' => 'Hungarian'
                                                           },
                                                   'hr' => {
                                                             'low' => 'http://download.ted.com/talks/TimBernersLee_2010U-low-hr.mp4',
                                                             'high' => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-hr.mp4',
                                                             'name' => 'Croatian'
                                                           }
                                                 },
                         'event' => 'TED2010',
                         'streamer' => 'rtmp://cp358131.edgefcs.net/ted',
                         'introDuration' => '11.82',
                         'audioDownload' => 'http://download.ted.com/talks/TimBernersLee_2010U.mp3?apikey=489b859150fc58263f17110eeb44ed5fba4a3b22',
                         'shareUrl' => 'http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide',
                         'canonical' => 'http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide'
                       }
                     ]
        };
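
A note on the `'isRtl' => $VAR1->{'precontrol'}` entries in the dump: Data::Dumper prints a reference it has already seen as a path back to its first occurrence instead of printing it again. Perl JSON decoders typically return one shared object for every `false` in the document, so my guess is that this is what we see here. The small sketch below reproduces the effect. It assumes JSON::PP did the decoding (which may not be the module the script above used), and the `languages` key is a name made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use JSON::PP qw(decode_json);
use Data::Dumper qw(Dumper);
use Scalar::Util qw(refaddr);

$Data::Dumper::Sortkeys = 1;    # stable key order in the printed dump

# Tiny stand-in for the real data; 'languages' is a hypothetical key name.
my $json = '{"precontrol": false, "languages": [{"languageCode": "ru", "isRtl": false}]}';
my $data = decode_json($json);

# Both false values are references to the very same boolean object.
if (refaddr($data->{precontrol}) == refaddr($data->{languages}[0]{isRtl})) {
    print "both false values are the same object\n";
}

# The second occurrence is therefore printed as a back-reference into $VAR1.
print Dumper $data;
```

In boolean context these objects behave as expected, so `if ($entry->{isRtl})` works without any special handling.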

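There is a lot of interesting data in that dump of the decoded JSON that we might be able to use to build nice applications: the speaker and the event, the list of subtitle languages, and download links for the video with subtitles in each of those languages.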
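As a first taste of using that data, here is a minimal sketch that lists the subtitled download links of a talk. To keep it self-contained it works on a hand-made excerpt with two of the languages, using only key names that appear in the dump (`name`, `event`, `subtitledDownloads`, `low`, `high`). The `subtitled_downloads` function is my own helper, not part of any module, and reaching the talk entry inside the full structure is left out here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# Hand-made excerpt of a single talk entry, copied from the dump above.
my $talk = {
    name  => 'Tim Berners-Lee: The year open data went worldwide',
    event => 'TED2010',
    subtitledDownloads => {
        tr => {
            name => 'Turkish',
            low  => 'http://download.ted.com/talks/TimBernersLee_2010U-low-tr.mp4',
            high => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-tr.mp4',
        },
        nl => {
            name => 'Dutch',
            low  => 'http://download.ted.com/talks/TimBernersLee_2010U-low-nl.mp4',
            high => 'http://download.ted.com/talks/TimBernersLee_2010U-480p-nl.mp4',
        },
    },
};

# Return one "code<TAB>name<TAB>url" row per language, sorted by language code.
# $quality is either 'low' or 'high'; languages lacking that quality are skipped.
sub subtitled_downloads {
    my ($talk, $quality) = @_;

    my $downloads = $talk->{subtitledDownloads} // {};
    my @rows;
    foreach my $code (sort keys %$downloads) {
        my $entry = $downloads->{$code};
        next if not defined $entry->{$quality};
        push @rows, join "\t", $code, $entry->{name}, $entry->{$quality};
    }
    return @rows;
}

say "$talk->{name} ($talk->{event})";
say for subtitled_downloads($talk, 'high');
```

Sorting the keys matters because Perl hashes have no fixed order, which is also why the keys in the dump above appear shuffled.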

TODO

To implement either of the ideas for TED, I'll also need to fetch a large list of talks, but let's leave that for another day and another article.