How I did it.
I got a ton of emails today asking exactly how I ripped the URLs for the Live 8 feeds.
Also, this blog has been linked from tons of other places. Well, here you go.
Note that this isn't going to work on a Windows machine; you need a Unix-ish OS.
First:
make and cd into a temp directory
wget the 2 files that will provide you with a list of artist IDs
wget "http://music.channel.aol.com/live_8_concert/berlin_paris_rome" \
"http://music.channel.aol.com/live_8_concert/london_philly_toronto"
next, parse those two files
cat berlin_paris_rome london_philly_toronto |\
grep -e artistid|tr '<>' '\n'|grep -e ^a|cut '-d"' -f2|\
tr '&?' '\n'|grep -e artistid|cut '-d=' -f2|tr -d '\r' > artistids
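If that pipeline looks like line noise, here is a rough sketch of the same stages wrapped in a function, with a comment per stage. The sample HTML line and the artist ID 12345 are made up for illustration, and the function name extract_artist_ids is mine; the real AOL pages may have looked different. I also wrote the newline twice in each tr set so it doesn't rely on tr padding the shorter set.

```sh
# Same stages as the pipeline above: HTML on stdin, one artist ID per line out.
extract_artist_ids() {
    grep -e artistid |        # keep only lines that mention an artistid
    tr '<>' '\n\n' |          # break tags apart: one tag or text run per line
    grep -e '^a' |            # keep anchor tags ("a href=...")
    cut '-d"' -f2 |           # the href value sits between the first two quotes
    tr '&?' '\n\n' |          # split the URL into its query parameters
    grep -e artistid |        # keep the artistid=... parameter
    cut -d= -f2 |             # keep the value after the "="
    tr -d '\r'                # drop stray carriage returns
}

# Hypothetical input line (not copied from the real page).
extract_artist_ids <<'EOF'
<td><a href="http://music.aol.com/artist/main.adp?tab=songvid&artistid=12345">Some Artist</a></td>
EOF
```

Against the real files you would run something like: cat berlin_paris_rome london_philly_toronto | extract_artist_ids > artistids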
then get the artist index pages for each artist.
mkdir artists
cat artistids |while read artistid;do
wget -O artists/$artistid "http://music.aol.com/artist/main.adp?tab=songvid&artistid=$artistid" -nv
done
Next, extract a list of pmmsids from the artist pages, only for Live 8.
grep -e OpenFullPMMSID artists/* |grep -e 'LIVE 8'|cut "-d'" -f4 > pmmsids
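To show what the cut is doing: it splits each matching line on single quotes and keeps the fourth piece. Here is a small sketch on two invented lines. The markup below is a guess at the general shape, not the actual AOL page source, and extract_pmmsids is just a name I picked.

```sh
# Keep Live 8 entries only, then take the 4th single-quote-delimited field.
extract_pmmsids() {
    grep -e OpenFullPMMSID | grep -e 'LIVE 8' | cut "-d'" -f4
}

# Hypothetical artist-page lines; only the first one mentions LIVE 8.
extract_pmmsids <<'EOF'
<a href='#' onclick="OpenFullPMMSID('1357877')">LIVE 8: Some Song</a>
<a href='#' onclick="OpenFullPMMSID('2222222')">Some Studio Video</a>
EOF
```

Only 1357877 comes out, since the second line fails the 'LIVE 8' filter. When grep runs over artists/* it prefixes each line with the file name, but that prefix lands in the first field, so the fourth field is unaffected.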
then grab the page that actually shows the video, using the pmmsid as a reference.
Your first instinct will probably be to try and grab this URL
http://mp.aol.com/video.index.adp?pmmsid=1357877
But that's wrong. Because we aren't using any cookie store with wget, AOL will prompt for what speed video you want.
Obviously I want the highest possible bitrate, so let's trick them by getting this URL.
http://mp.aol.com/speed.adp?speed=bb&url=/video.index.adp?pmmsid=1357877
This is slightly inefficient, but it works.
This makes the final code look something like this:
mkdir pmm
cat pmmsids|while read line;do
wget "http://mp.aol.com/speed.adp?speed=bb&url=/video.index.adp?pmmsid=$line" -O pmm/$line -nv
done
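If you want to check the URLs before hammering the server, a dry run helps. This is only a sketch: speed_url and dry_run are helper names I made up, and the URL pattern is the one shown above. It prints the wget commands instead of running them.

```sh
# Build the "speed" URL for one pmmsid.
speed_url() {
    printf 'http://mp.aol.com/speed.adp?speed=bb&url=/video.index.adp?pmmsid=%s\n' "$1"
}

# Read pmmsids on stdin and print the wget command for each, without running it.
dry_run() {
    while read -r id; do
        printf 'wget -nv -O pmm/%s "%s"\n' "$id" "$(speed_url "$id")"
    done
}

# Example with the pmmsid from above.
printf '1357877\n' | dry_run
```

With the real list you would run: dry_run < pmmsids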
Run these files through grep & awk.
grep -e PlayVideoLarge pmm/*|tr ';' '"'\
|awk '-F"' '{printf "http://pdl.stream.aol.com";printf $2;print "_dl.mov"}'
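Here is roughly what that last step does, run on one invented line. The PlayVideoLarge call below and its path are hypothetical (I'm only assuming the path is the first thing after a double quote and ends at a semicolon or a quote), and build_urls is my own name. The sketch uses a single print rather than printf $2, so a stray % in the path can't be read as a format string.

```sh
# Turn ';' into '"', split on '"', and wrap field 2 in the download URL.
build_urls() {
    grep -e PlayVideoLarge | tr ';' '"' |
    awk -F'"' '{ print "http://pdl.stream.aol.com" $2 "_dl.mov" }'
}

# Hypothetical player-page line (not real AOL markup).
build_urls <<'EOF'
PlayVideoLarge("/hypothetical/path/clip123;extra");
EOF
```

For this input it prints http://pdl.stream.aol.com/hypothetical/path/clip123_dl.mov. With real files the usage would be along the lines of: cat pmm/* | build_urls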
Voilà, you have your list.
More to come later.
8 Comments »
July 6th, 2005 @ 04:33:07
Genius. Pure genius. Well, on second thought, 93% perspiration, 6% electricity, 4% evaporation, and 2% total geekery.
July 7th, 2005 @ 07:59:44
Hey, you should turn the files into torrents and post them, so that you don't hit bandwidth limits on your blog account. Let us know what the torrent URL is so that we can download them that way.