#!/usr/bin/perl -w
# Ruzgfpegk
# 2015-04-20 - v0.8d
#
# Script covered by the WTFPL
#
# This script:
# - Reads an XML file which is an RSS feed already downloaded from JapanesePod101.com
# - Creates a folder hierarchy
# - Writes text files when an RSS description is added to an item
# - Downloads the files into these folders and sets their modification time to the item date
#
# For instance, an item titled "Lower Intermediate Lesson #46 - Mothers' Talk - Audio" will be saved as
# Audio.mp3 in the folder "Lower Intermediate Lesson #46 - Mothers' Talk".
#
# This may or may not be against the Terms of Service of the website.
# You'll need an active membership to JP101 : http://www.japanesepod101.com/
# A free account will let you download the latest 10 audio lessons.
# http://www.japanesepod101.com/helpcenter/getstarted/downloadhowto # Links for the free feed (RSS, you can use it there)
# http://www.japanesepod101.com/helpcenter/getstarted/itunesfeeds # Links for basic and premium feeds (ITPC only ?)
# http://www.japanesepod101.com/learningcenter/account/myfeed # Create your own RSS (premium)
#
# Known issues:
# - Windows is NOT supported. This is because many Perl5 modules don't support Unicode filepaths under Win32.
# - If you cancel (Ctrl+C) the process during a download, the last file won't be fully saved
# but should be downloaded next time as the size will be different from the one in the XML.
# - The size estimation doesn't take into account what has already been downloaded (it's only based on the RSS contents).
# - The download order is NOT the RSS order. Processing is done in sorted order of the path names.
# This can be an issue if files of a folder are in different parts of the RSS and you use dl_limit.
# - This script can be VERY verbose. It's because the RSS file their website generates is full of shit
# like multiple different entries for the same file.
# - The XML files generated by JP101 are moving targets. Errors appear and disappear, lessons are moved among sections, ...
# so I can't guarantee a clean result.
use strict;
use warnings;
use utf8; # This file uses Unicode characters
use IO::File; # To create files cleanly
use File::Path qw(make_path); # To create paths cleanly
use Time::Piece; # To parse dates in the XML file
use Unicode::Collate; # Goddammit Perl. See http://www.perl.com/pub/2011/08/whats-wrong-with-sort-and-how-to-fix-it.html
# We could use File::Spec::Functions to handle filepaths as it's included in ActivePerl, but it's not in the OpenSuSE repository.
require LWP::UserAgent; # To download the files
require XML::Simple; # To parse the XML file
## Configuration
my %conf = (
'XMLFile' => '/path/to/your/JapanesePod101.com.xml',
'OutDir' => '/path/to/your/target/folder',
'user' => 'YOUR_JP101_USERNAME',
'pass' => 'YOUR_JP101_PASSWORD',
'pretend' => 0, # To avoid downloading files (just create folders and text files)
'overwrite' => 0, # To overwrite old text files or already downloaded files (why?). "1-Length" items not included.
'dl_limit' => 0, # How many max files to download at each run? 0 for max.
'verbose' => 0, # To show each downloaded file on the console
'throttle' => 1, # How many seconds should we wait between two files?
'fallback' => q{_}, # In paths, by what character do we replace illegal ones? (like '?' or '*')
'separator' => q{/}, # To do things cleanly, use \\ in Windows and / in Linux/OSX.
'wait' => 3, # How many seconds to wait before starting?
'nosummary' => 0, # Don't create txt files. Useful while debugging or if you already have them.
'nofolders' => 0, # Don't even create folders. Useful while debugging regexes.
'skipOredl' => 1, # Skip redownloading of non-audio files (JP101 RSS indicates wrong values for PDFs)
'unicodize' => 1, # Replace illegal NTFS characters by their Unicode full-width equivalent
'hierarchy' => 1, # Use the official lesson hierarchy instead of dumping everything in the same folder
'duplicates'=> 1, # If you want to download duplicates instead of skipping them (which can skip good files and DL bad ones)
'UserAgent' => 'iTunes/10.2.1 (Macintosh; Intel Mac OS X 10.7) AppleWebKit/534.20.8',
'sections' => '01234XZ', # If you only want to download section 0, put '0'. For everything, '01234XZ'. Use with 'hierarchy'=>1.
);
my %hierarchy = (
# Up-to-date as of 2015-04-12. Please update this part as needed. Non-sorted items will download at the root folder.
'0 - Introduction' => [
# Audio
'Culture Class: Holidays in Japan',
'Introduction',
'All About',
'Japanese Culture Classes',
],
'1 - Absolute Beginner' => [
# Audio
'Survival Phrases Season 1',
'Survival Phrases Season 2',
'Newbie Season 1',
'Newbie Season 2',
'Newbie Season 3',
'Newbie Season 4',
'Newbie Season 5',
'Absolute Beginner Season 1',
'Absolute Beginner Season 2',
'Top 25 Japanese Questions You Need to Know',
'Culture Class: Essential Japanese Vocabulary',
# Video
'Absolute Beginner Questions Answered by Hiroko',
'Basic Japanese',
'Innovative Japanese',
'Japanese Body Language and Gestures',
'Just For Fun',
'Kanji Videos with Hiroko',
'Kantan Kana',
'Learn Japanese Grammar Video - Absolute Beginner',
'Learn with Pictures and Video',
'Ultimate Japanese Pronunciation Guide',
'Absolute Beginner Japanese for Every Day',
],
'2 - Beginner' => [
# Audio
'Beginner Season 1',
'Beginner Season 2',
'Beginner Season 3',
'Beginner Season 4',
'Beginner Season 5',
'Beginner Season 6',
'Lower Beginner Season 1', # Changed because the website doesn't state "Season 1"
'Lower Beginner Season 2',
'Upper Beginner Season 1',
# Video
'Everyday Kanji',
'Japanese Counters for Beginners',
'Japanese Words of the Week with Risa for Beginners',
'Learn with Video',
'Video Vocab Season 1',
],
'3 - Intermediate' => [
# Audio
'Japanese for Everyday Life Lower Intermediate',
'Lower Intermediate Season 1',
'Lower Intermediate Season 2',
'Lower Intermediate Season 3',
'Lower Intermediate Season 4',
'Lower Intermediate Season 5',
'Lower Intermediate Season 6',
'Intermediate Season 1',
'Upper Intermediate Season 1',
'Upper Intermediate Season 2',
'Upper Intermediate Season 3',
'Upper Intermediate Season 4',
'Upper Intermediate Season 5',
# Videos
'iLove J-Drama',
'Japanese Words of the Week with Risa for Intermediate Learners',
'Journey Through Japan',
'Learning Japanese through Posters',
'Must-Know Japanese Holiday Words'
],
'4 - Advanced' => [
# Audio
'Advanced Audio Blog 1',
'Advanced Audio Blog 2',
'Advanced Audio Blog 3',
'Advanced Audio Blog 4',
'Advanced Audio Blog 5',
'Advanced Audio Blog 6',
# Video
'Video Culture Class: Japanese Holidays',
],
'X - Bonus Video Courses' => [
# Video
'Japanese Listening Comprehension for Absolute Beginners',
'Japanese Listening Comprehension for Beginners',
'Japanese Listening Comprehension for Intermediate Learners',
'Japanese Listening Comprehension for Advanced Learners',
'Video Prototype Lessons',
],
'X - Bonus Courses' => [
# Audio
'Inner Circle',
'Prototype Lessons',
'JLPT Season 1 - Old 4/New N5',
'JLPT Season 2 - New N4',
'JLPT Season 3 - New N3',
'Japanese Children\'s Songs',
'Onomatopoeia',
'Particles',
'Yojijukugo',
'Extra Fun',
],
'Z - News & Announcement' => [
# Audio
'Cheat Sheet to Mastering Japanese',
'News'
],
);
my $re_title_2P = qr/^
(.+?) # Topic ($1)
( # ($2) :
(?:\ -\ ) # Usual separator ' - '
| # Or
(?:\ \#\d+?[^0-9\ ]\ ) # ' #\d+[_-] ' (the ' #2: ' example)
)
(.+) # Description: everything (inc. subdescription) until... ($3)
(?:\ -\ ) # Usual separator
(.*) # Last part (file title) ($4)
$/xmos;
my $re_title_1 = qr/^
(.+?) # Topic ($1)
( # ($2) :
(?:\ -\ ) # Usual separator ' - '
| # Or
(?:\ \#\d+?[^0-9\ ]\ ) # ' #\d+[_-] ' (the ' #2: ' example)
)
(.+) # Description: everything (inc. subdescription) until... ($3)
$/xmos;
my %url_blacklist = ();
my @errors;
my %dl_list;
## Tweaking
# We want to see when the file is downloading... before the completed line (with "done") is sent to the console.
STDOUT->autoflush(1);
# We want to avoid "Wide character in" (print, die, ...) warnings in the console.
binmode STDOUT, ':encoding(UTF-8)';
binmode STDERR, ':encoding(UTF-8)';
## libwww initialization
my $ua = LWP::UserAgent->new;
# An HTTP identification is required for most files. If they ever change their "realm name" (second string), change it here too.
$ua->credentials( 'www.japanesepod101.com:80', 'JapanesePod101.com Premium Feed - Requires an Active Premium Subscription!',
$conf{'user'}, $conf{'pass'} );
# Those podcasts are supposed to be played through iTunes. So let's pretend we're iTunes to avoid looking suspicious.
# Feel free to update it too, using websites like http://udger.com/resources/ua-list/browser-detail?browser=iTunes
$ua->agent( $conf{'UserAgent'} );
## XML Parser initialization
print 'Loading RSS file (please be patient)... ';
my $xs = XML::Simple->new();
my $ref = $xs->XMLin( $conf{'XMLFile'} );
print "done.\n";
# Uncomment the following if you need to debug the data structure. You shouldn't have to.
#use Data::Dumper;
#print Dumper( $ref->{'channel'}->{'item'} );
## Initial folder creation
# Create output folder if it doesn't exist
if( ! -d $conf{'OutDir'} )
{
make_path( $conf{'OutDir'} ); # make_path does carp/croak by itself
}
chdir $conf{'OutDir'} or die "ERROR: Can't change to folder $conf{'OutDir'}: $!\n";
## Checks
die 'ERROR: Can\'t have a negative waiting time!' if $conf{'wait'} < 0;
## Display of the summary of what's to come
# Calculate the total size to download...
my $size_total = 0;
my $size_session = 0;
my $filecount = 0;
for my $item ( @{$ref->{'channel'}->{'item'}} )
{
next if ref($item->{'title'}) eq 'HASH'; # Skip empty entries. See further down.
$filecount++;
$size_total += $item->{'enclosure'}->{'length'};
if( $filecount == $conf{'dl_limit'} ) { $size_session = $size_total }
}
if( $filecount < $conf{'dl_limit'} ) { $size_session = $size_total }
print
$conf{'pretend'}
? 'Pretending to download '
: 'Downloading '
,
$conf{'dl_limit'} > 0
? "the first $conf{'dl_limit'} undownloaded files out of $filecount"
: "$filecount files"
,
".\n"
;
print 'Total download size: ',
$conf{'dl_limit'} > 0
? sprintf('%.2f', ($size_session/1024)/1024) . 'MB out of ' . sprintf('%.2f', ($size_total/1024)/1024)
: sprintf('%.2f', ($size_total/1024)/1024) . 'MB'
,
" (incorrect values if resuming).\n"
;
print "Download will start in $conf{'wait'} second" . ($conf{'wait'} != 1 ? q{s} : q{}) . ". Press Ctrl+C now to abort.\n\n";
sleep $conf{'wait'};
## Various variable initialization
my $dl_count = 0;
## The big loop to download everything
for my $item ( @{$ref->{'channel'}->{'item'}} )
{
# Skip immediately if the URL of the file is known as shitty
next if( defined $url_blacklist{$item->{'enclosure'}->{'url'}} );
# Each item (PDF, Audio, ...) is processed in order
my $title = $item->{'title'};
# Skip bogus (empty) entries (title would be an empty hash instead of a string)
next if ref($title) eq 'HASH';
# Enclosure/url is the file URL : if it doesn't exist we shouldn't try to download it.
# 'guid' isn't always set so we don't use it.
if( ! $item->{'enclosure'}->{'url'} )
{
warn "WARNING: URL not found for item $title!\n";
next;
}
# Fix every mistake they made and rename for consistency
# Starting by small fixes. Count obtained with command lines like this (remove .* when ^ or $):
# $ grep -P "<title>.*THING_YOU_SEARCH.*</title>" JapanesePod101.com-20150411.xml | wc -l
# Replace multiple spaces by only one (506 titles on the 20150411 listing)
$title =~ s/\s{2,}/ /g;
# Remove spaces between # and a number ('# 2' -> '#2') (65 titles on the 20150411 listing)
$title =~ s/#\s(\d)/#$1/g;
## Replace long dashes by dashes, v0 (42 titles on the 20150411 listing)
#$title =~ s/(\w)—(\w)/$1 - $2/; # Probably not necessary
# Replace not-really-dashes by dashes and not-really-quotes by quotes (27 and 230 titles on the 20150411 listing)
$title =~ tr/–’/-'/;
# Add a space between number and dashes ('#2-' -> '#2 -') and changes '#2:' to '#2 -' (312 titles on the 20150411 listing)
$title =~ s/(#\d+)[:\-]/$1 -/;
# Remove leading space (6 titles on the 20150411 listing)
$title =~ s/^\s(.*?)$/$1/;
# Remove trailing space (64 titles on the 20150411 listing)
$title =~ s/^(.*?)\s$/$1/;
# Title mistakes (taking into account previous corrections!!!)
# 0 - Introduction
## Japanese Culture Classes
$title =~ s/^(Japanese Culture Class #4[12]) - Aomori-ben #/$1 - Aomori Dialect /; # Other lessons are about "Dialect", not "ben". And remove the '#' (not in 45, 46, 47...) # Checked 20150419 GO
# "Japanese Culture Class" lessons are off-numbered from the 47th entry included # Checked 20150419 GO
if( $title =~ /^Japanese Culture Class #(\d+)/ && $1 >= 47 )
{
my $newnumber = $1-1;
$title =~ s/#\d+/#$newnumber/;
}
$title =~ s/^(Happy New Year from JapanesePod101\.com!)/Japanese Culture Class #58 - $1/; # Badly sorted name # Checked 20150419 GO
# 1 - Absolute Beginner
## Kantan Kana
$title =~ s/^(Kantan Kana #1) - (Lesson Notes(?: Lite)?|Video)$/$1 - Hiragana vowels あ, い, う, え, お - $2/; # Because of old "phantom" entries (with an additional file) # Checked 20150414 GO
## Learn with Pictures and Video
$title =~ s/^(Learn with Pictures and Video) S4/$1/; # S4? Dafuq, there is only one season. # Checked 20150419 GO
## Newbie Season 1
$title =~ s/^Newbie #7 - (Learn to Count in Japanese!)/Newbie #6 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie #11 - (Rise and Shine 2)/Newbie #10 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #12 - (Winter Blues)/Newbie Lesson #11 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #13 - (I'm Starving!)/Newbie Lesson #12 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #14 - (Making the Grade)/Newbie Lesson #13 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #15 - (When's Your Birthday\?)/Newbie Lesson #14 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #18 - (To Exist or Not to Exist)/Newbie Lesson #15 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #24 - (How Much Will [\s\w]+\?)/Newbie Lesson #21 - $1/; # Wrong number # Checked 20150419 GO
$title =~ s/^Newbie Lesson #25 - (Where Are You Headed\?)/Newbie Lesson #22 - $1/; # Wrong number # Checked 20150419 GO
## Newbie Season 3
$title =~ s/^(Newbie Lesson S3 #3 -)(Nihongo Dōjō)/$1 $2/; # Space missing # Checked 20150419 GO
## Newbie Season 4
$title =~ s/^Newbie Lesson #4 - S4:/Newbie S4 #4 -/; # Name mixed up on duplicate. # Checked 20150419 GO
$title =~ s/^(Newbie Lesson S4 #7 -) S4: (How to Say Where Things Are)/$1 $2/; # Useless 'S4: ' # Checked 20150419 GO
$title =~ s/^(Newbie Lesson S4 #13) -/$1/; # Useless ': ' (became ' - ') # Checked 20150419 GO
## Survival Phrases Season 1
$title =~ s/^(Survival Phrases #\d+ - CSP) (\d)/$1$2/; # 4,7,9,10: The others don't have a space after CSP # Checked 20150419 GO
$title =~ s/^(Survival Phrases #\d+ - CSP[123]) /$1 - /; # 1,2,3: ' -' missing. # Checked 20150419 GO
# 2 - Beginner
## Beginner Season 4
$title =~ s/^(Beginner Lesson S4 #1[35] - )- /$1/; # Useless ":" in lessons 13 & 15 # Checked 20150419 GO
## Learn with Video
$title =~ s/^(Learn with Video) S2 #20 /$1 #20 /; # There is no S2 # Checked 20150419 GO
## Video Vocab Season 1
$title =~ s/^(Video Vocab Lesson #)/Video Vocab Season 1 #/; # Consistency # Checked 20150419 GO
# 3 - Intermediate
## iLove J-Drama
$title =~ s/^(iLove Chapter [IV]+ - .*? -) iLove -/$1/; # Remove useless "iLove". [IV] because we didn't reach X yet. # Checked 20150419 GO
$title =~ s/^(iLove Chapter IV - Nanase's Choice) -$/$1 - (No Subs)/; # Not '- $' because of the trailing space removal # Checked 20150419 GO
$title =~ s/^(iLove J-Drama #2 - iLove Chapter II) - ([^i].*)$/$1 - iLove - $2/; # Oh well. Chapter title "iLove" missing. # Checked 20150419 GO
# iLove J-Drama are sometimes numbered with Roman numerals
if( $title =~ /^iLove Chapter ([IV]+) -/ )
{
my %eq = ( 'I' => 1, 'II' => 2, 'III' => 3, 'IV' => 4, 'V' => 5, 'VI' => 6, 'VII' => 7, 'VIII' => 8 ); # Please complete if needed
my $number = $eq{$1};
$title =~ s/^(iLove Chapter)/iLove J-Drama #$number - $1/;
}
## Journey Through Japan
$title =~ s/^(Journey Through Japan #\d - .*?) - Journey Through Japan/$1/; # Useless repeat # Checked 20150419 GO
## Learning Japanese through Posters
$title =~ s/^(Learning Japanese Through Poster Phrases #)/Learning Japanese through Posters $1/; # Name differs from category. # Checked 20150419 GO
## Lower Intermediate Season 2
$title =~ s/^(Lower Intermediate Lesson S2 #3 - Making the) Rug Rats/$1 Rugrats/; # In #3 use the same term as in #4 # Checked 20150419 GO
## Lower Intermediate Season 3
$title =~ s/^(Lower Intermediate Lesson S3 #[67] - How to Meet People) on Line/$1 Online/; # Use for #6 and #7 the term in #8 # Checked 20150419 GO
$title =~ s/^(Lower Intermediate Lesson S3 #([1-4]) - First Time in an Onsen! What Should I Do\?)/$1 $2/; # Number the lessons # Checked 20150419 GO
## Lower Intermediate Season 4
$title =~ s/^(Lower Intermediate Lesson S4 #[1-4] - Giving and Receiving in Japanese)-/$1 - /; # Add spaces # Checked 20150419 GO
# 4 - Advanced
## Advanced Audio Blog 1
$title =~ s/^(Audio Blog) #70 - (Sapporo Snow Festival)/$1 #69 - $2/; # Wrong number # Checked 20150419 GO
$title =~ s/^(Audio Blog) #71 - (Setsubun)/$1 #70 - $2/; # Wrong number # Checked 20150419 GO
$title =~ s/^(Audio Blog) #86 - (Vending Machine Evolution)/$1 #85 - $2/; # Wrong number # Checked 20150419 GO
$title =~ s/^(Audio Blog) #87 - (Playing Grown Up)/$1 #86 - $2/; # Wrong number # Checked 20150419 GO
$title =~ s/^(Audio Blog) #95 - (The Beckoning Cat)/$1 #94 - $2/; # Wrong number # Checked 20150419 GO
$title =~ s/^(Audio Blog) #103 - (Toilets in Japan: Instructions Needed!)/$1 #102 - $2/; # Wrong number # Checked 20150419 GO
$title =~ s/^(Audio Blog) #/$1 1 #/; # 1st part wasn't numbered # Checked 20150419 GO
$title =~ s/^(Advanced Audio Blog) (#35)/$1 1 $2/; # The only "Advanced" of S1... # Checked 20150419 GO
## Advanced Audio Blog 2
$title =~ s/^(Audio Blog S2 #\d+) - Tokyo Fashon Files/$1 - Tokyo Fashion Files/; # Spelling error (#8 and #11) # Checked 20150419 GO
$title =~ s/^(Audio Blog S2 #9 - Tokyo Fashion Files #3 -)/$1 /; # Space missing (note '#3:Men'->'#3 -Men') # Checked 20150419 GO
## Advanced Audio Blog 4
$title =~ s/^(Advanced Audio Blog S4 #25 - Top 10 Japanese Holidays:)/$1 /; # Space missing # Checked 20150419 GO
## Advanced Audio Blog 6
$title =~ s/^(Advanced Audio Blog S6 #\d+ - Japanese Tourist Spots)[—\-]/$1 - /; # Missing spaces, bad dash # Checked 20150419 GO
## (VARIOUS)
$title =~ s/^(Audio Blog)/Advanced $1/; # They switched to "Advanced" in the middle of S3 # Checked 20150419 GO
$title =~ s/^(Advanced Audio Blog) S(\d)/$1 $2/; # AABs don't use "Season"... # Checked 20150419 GO
# X - Bonus Courses
## Extra Fun
$title =~ s/^(Premium Lesson #7 - SS3) -/$1:/; # '-' instead of ':' # Checked 20150419 GO
$title =~ s/^(Premium Lesson #20 - SS):(16)/$1$2:/; # Misplaced ':' # Checked 20150419 GO
$title =~ s/^(Premium Lesson #25 -) (Practicing Priestess)/$1 SS21: $2/; # They forgot 'SS21:' # Checked 20150419 GO
$title =~ s/^(Premium Lesson #26 -) SS23/$1 SS22/; # SS22, not SS23 # Checked 20150419 GO
$title =~ s/^(Premium Lesson #29 -) (Do Not Drink the Water!)/$1 SS25: $2/; # They forgot 'SS25:' # Checked 20150419 GO
$title =~ s/^(Premium Lesson #33 - O-bon Kaidan:)/$1 Stairway In The Middle Of The Night/; # Title missing # Checked 20150419 GO
## Japanese Children's Songs
$title =~ s/^(Japanese Songs #)/Japanese Children's Songs #/; # Wrong name # Checked 20150419 GO
## JLPT Season 1 - Old N4/New N5
$title =~ s/^(JLPT #\d - JLPT Level 4 Last Minute Prep Course) (\d)/$1 #$2/; # Consistency for #-prefixed # Checked 20150419 GO
## JLPT Season 2 - New N4
$title =~ s/^(JLPT S2 #1 - New JLPT N4 Prep Course)/$1 #1/; # They forgot the numbering for this one # Checked 20150419 GO
## JLPT Season 3 - New N3
$title =~ s/^(JLPT S3 #((?:[89])|(?:1[13])) - New JLPT N3 Prep Course) \d+/$1 #$2/; # 9,8,11,13 miss the 2nd "#" # Checked 20150419 GO
## Inner Circle
$title =~ s/^(Inner Circle #02)/Inner Circle #2/; # Other lessons don't use zeroes # Checked 20150419 GO
# Z - News & Announcement
## News
$title =~ s/^News #52 - (JLPT Ganbatte & EnglishPod101.com!)/News #51 - $1/; # Badly numbered name # Checked 20150419 GO
$title =~ s/^(Wait on a Deal Like This - And Just Like Santa, It's Gone!)/News #68 - $1/; # Category & number missing # Checked 20150419 GO
$title =~ s/^(Get Your 2009 Lesson Schedule Here!)/News #69 - $1/; # Category & number missing # Checked 20150419 GO
$title =~ s/^News #91 - JapanesePod101\.com101\.com/News #91 - JapanesePod101.com/; # Seriously guys? # Checked 20150419 GO
# Unsorted
$title =~ s/^(Lower Intermediate Lesson S4 #24 - Munch on the Idea of Japanese) Lunch Box (Day!)/$1 Lunchbox $2/; # Harmless collisions will occur.
#$title =~ s/^Newbie Lesson S1 #24 - (How Much Will You Spend in Japan\?).*$/UNHOLY_SHIT/; # Nope. Find out why by yourself and cry.
#$title =~ s/^Newbie Lesson S1 #25 - (Where Are You Headed\?).*$/UNHOLY_SHIT/; # ...the fuck is this. PDF is fucked up.
# Global replacements
# Add "S1" to lessons without it when they have more than one or if the listing shows it.
$title =~ s/^((?:(?:Absolute|Upper|Lower) Beginner)|(?:(?:Lower )?Intermediate)|(?:Newbie)|(?:JLPT)|(?:Survival Phrases)) #/$1 S1 #/;
$title =~ s/^((?:(?:Lower |Upper )?Intermediate Lesson)|(?:(?:Beginner|Newbie) Lesson)) #/$1 S1 #/;
# When "Lesson" is forgotten... (with S\d+)
$title =~ s/^((?:Lower )?Beginner|Newbie|(?:(?:Lower |Upper )?Intermediate)) (S\d+) #/$1 Lesson $2 #/;
$title =~ s/^((?:Lower |Upper )?(?:Beginner|Intermediate|Newbie|Premium)) Lesson (S\d)/$1 $2/; # These don't use the term "Lesson"
$title =~ s/ S(\d+)/ Season $1/; # The website uses the full "Season" terminology nearly everywhere.
$title =~ s/^(Premium Lesson)/Extra Fun $1/;
$title =~ s/^Japanese Culture Class #/Japanese Culture Classes #/; # Wrong name
$title =~ s/^Kanji Video Lesson #/Kanji Videos with Hiroko #/; # Wrong name
$title =~ s/^(JLPT Season 1) #/$1 - Old 4\/New N5 #/; # Wrong name
$title =~ s/^(JLPT Season 2) #/$1 - New N4 #/; # Wrong name
$title =~ s/^(JLPT Season 3) #/$1 - New N3 #/; # Wrong name
next if $title eq 'UNHOLY_SHIT'; # We skip entries marked as shit (bad duplicates which can overwrite correct files).
my $path = q{}; # Folder where to put the file
my $filetitle = q{}; # First part of the file name (without the extension)
my $extension = substr $item->{'enclosure'}->{'url'}, rindex( $item->{'enclosure'}->{'url'}, q{.} );
# Title examples:
#Learn Japanese Grammar Video - Absolute Beginner #10 - Introduction to Adjectives in Japanese - Lesson Notes
#Beginner S6 #20 - Whenever You Have a Moment, Learn Some More Japanese! - Kanji Close-Up
#Video Culture Class: Japanese Holidays #3 - Bean-Throwing Ceremony - Video Vocab
#Japanese for Everyday Life Lower Intermediate #25 - Offering Your Help
#Kantan Kana #2: Hiragana か, き, く, け, こ
#Advanced Audio Blog #1 - Top 10 Japanese Authors: Natsume Sōseki
#How This Feed Works
my $separator_count = 0;
while ($title =~ /( - )|( #\d+[^ ] )/g) { $separator_count++ }
#print "$separator_count - $title\n" if $separator_count != 2; # Debug line.
my $section_number;
if( $conf{'hierarchy'} )
{
BROWSE: for my $level ( sort keys %hierarchy )
{
for my $chapter ( @{$hierarchy{$level}} )
{
if( $title =~ /^$chapter .* - .*$/ )
{
my $title2 = $title;
my $chapter2 = $chapter;
$section_number = substr( $level, 0, 1 );
# Using this twice is ugly, but we want to test with an unclean string and affect clean ones...
if( $conf{'unicodize'} )
{
# Replace illegal characters by their full-width Unicode equivalent
$title2 =~ tr{?*:|"<>/\\}{？＊：｜＂＜＞／＼};
$chapter2 =~ tr{?*:|"<>/\\}{？＊：｜＂＜＞／＼};
}
else
{
# Replace illegal characters
$title2 =~ s/[?:"*<>|\/]/$conf{'fallback'}/g;
$chapter2 =~ s/[?:"*<>|\/]/$conf{'fallback'}/g;
}
$title2 =~ /^(?:$chapter2) (.*)(?: - )(.*)$/;
my $lesson = $1;
$path = $level.$conf{'separator'}.$chapter2.$conf{'separator'}.$lesson;
$filetitle = $2;
last BROWSE;
}# else { print "--> $title | $chapter\n" }
}
}
}
# The section number must be contained in the 'sections' option to continue
if( defined $section_number && index( $conf{'sections'}, $section_number ) == -1 )
{
next;
}
# We also put this here for non-hierarchy
if( $conf{'unicodize'} )
{
# Replace illegal characters by their full-width Unicode equivalent
$title =~ tr{?*:|"<>/\\}{？＊：｜＂＜＞／＼};
}
else
{
# Replace illegal characters
$title =~ s/[?:"*<>|\/]/$conf{'fallback'}/g;
}
if( ! $path ) # When hierarchy isn't used or if it failed to create a path
{
print "WARNING: Can't sort $title\n" if $conf{'hierarchy'};
# Case for 3+ elements (premium RSS) in the title
if( $separator_count >= 2 && $title =~ $re_title_2P )
{
$path = $1.$2.$3;
$filetitle = $4;
}
# Case for 2 elements (free RSS)
elsif( $separator_count == 1 && $title =~ $re_title_1 )
{
$path = $1.$2.$3;
if( $extension eq '.mp3' )
{
$filetitle = 'Audio';
}
elsif( $extension eq '.m4v' )
{
$filetitle = 'Video';
}
}
# Case for 1 element
elsif( $separator_count == 0 )
{
$filetitle = $title;
}
else
{
warn "WARNING: Unsupported title format $title! ($separator_count separators detected)\n";
next;
}
}
# If a path ends with "..." in Windows, the dots disappear. Yup.
$path =~ s/\.\.\.$/…/;
if( !($conf{'nofolders'}) && !(-d $path) )
{
# The working directory is already OutDir since the earlier chdir
make_path( $path ) or warn "WARNING: Folder $path couldn't be created!\n";
}
my $t = Time::Piece->strptime( $item->{'pubDate'}, '%a, %d %b %Y %T %z' ); # Mon, 12 Dec 2011 18:30:00 +0900 # http://www.unix.com/man-page/FreeBSD/3/strftime/
if( !($conf{'nosummary'}) && $item->{'itunes:summary'} )
{
my $summarylength = length($item->{'itunes:summary'});
my $skip = 0;
$skip++ if $summarylength <= 306; # "In this global economy, it\'s important to never burn bridges, and more importantly, give yourself as many opportunities as possible! And Innovative Language Learning has the key to opening those doors! The more languages you know, the more opportunities you will have - both personally and professionally." # See at the end for the other useless summaries
#print "\n$item->{'itunes:summary'}\n_______________________________________\n"
$skip++ if $summarylength == 321; # "Learn Japanese with JapanesePod101!\nDon't forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n\n-------Lesson Dialog-------\n\n---------------------------\nLearn Japanese with JapanesePod101!\nDon't forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!" # "Empty"
$skip++ if $summarylength == 322; # "Learn Japanese with JapanesePod101!\nDon't forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n\n----Vocabulary & Phrases---\n\n\n---------------------------\nLearn Japanese with JapanesePod101!\nDon't forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n" # "Empty"
$skip++ if $summarylength == 323; # "Learn Japanese with JapanesePod101!\nDon\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n\n-------Lesson Dialog-------\n\n---------------------------\nLearn Japanese with JapanesePod101!\nDon\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!" # Most common "empty"
$skip++ if $summarylength == 324; # "Learn Japanese with JapanesePod101!\nDon\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n\n----Vocabulary & Phrases---\n\n\n---------------------------\nLearn Japanese with JapanesePod101!\nDon\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!" # "Empty"
my $summarypath = $conf{'OutDir'}.$conf{'separator'}.$path.$conf{'separator'}.$filetitle.'.txt';
if( !($skip) && ( !(-e $summarypath) || $conf{'overwrite'} ) ) # If you don't skip and the path doesn't exist or you overwrite...
{
my $summarytext = $item->{'itunes:summary'};
$summarytext =~ s/\\'/'/g;
my $summary;
$summary = IO::File->new( $summarypath, '>:utf8' ) or die "Couldn't open $summarypath for writing: $!";
$summary->binmode( ':encoding(UTF-8)' ); # To avoid "Wide character in print" warnings
$summary->print( $summarytext );
$summary->close();
utime $t->epoch, $t->epoch, $summarypath;
}# else { print "$skip $summarylength $summarypath\n" } # Uncomment this to debug if summaries aren't written
}
my $filepath = $conf{'OutDir'}.$conf{'separator'}.$path.$conf{'separator'}.$filetitle.$extension;
if( ! exists $dl_list{$filepath} )
{
$dl_list{$filepath} = { 'url' => $item->{'enclosure'}->{'url'}, 'length' => $item->{'enclosure'}->{'length'}, 'date' => $t->epoch };
}
else
{
if( $dl_list{$filepath}->{'url'} eq $item->{'enclosure'}->{'url'} )
{
if( $dl_list{$filepath}->{'length'} == 1 && $item->{'enclosure'}->{'length'} > 1 )
{
print "WARNING: Entry '$filepath' already exists with same URL " . $item->{'enclosure'}->{'url'} .
" but the new entry has a valid size... replacing the old one.\n";
$dl_list{$filepath} = { 'url' => $item->{'enclosure'}->{'url'}, 'length' => $item->{'enclosure'}->{'length'}, 'date' => $t->epoch };
}
elsif( $dl_list{$filepath}->{'length'} > 1 && $item->{'enclosure'}->{'length'} == 1 )
{
print "WARNING: Entry '$filepath' already exists with same URL " . $item->{'enclosure'}->{'url'} .
" but the new entry has a bogus size... skipping.\n" if $conf{'verbose'};
}
else # Both sizes are bogus ("1"), or both look valid (whether equal or not)
{
print "WARNING: Entry '$filepath' already exists with same URL " . $item->{'enclosure'}->{'url'} .
"... skipping.\n" if $conf{'verbose'};
}
}
else
{
if( $conf{'duplicates'} )
{
my $filepath2 = $conf{'OutDir'}.$conf{'separator'}.$path.$conf{'separator'}.$filetitle.' DUPLICATE'.$extension;
if( exists $dl_list{$filepath2} )
{
print "WARNING: Duplicate for $filepath already saved... skipping.\n"; # Keep at most one duplicate for now
}
else
{
$dl_list{$filepath2} = { 'url' => $item->{'enclosure'}->{'url'}, 'length' => $item->{'enclosure'}->{'length'}, 'date' => $t->epoch };
}
}
else
{
print "WARNING: Entry '$filepath' already exists with different URL " . $item->{'enclosure'}->{'url'} .
" VS previous " . $dl_list{$filepath}->{'url'} . "... skipping.\n";
}
}
}
if( $conf{'dl_limit'} > 0 && $dl_count >= $conf{'dl_limit'} ) # ">=" so the limit still triggers if the counter ever overshoots
{
print "Reached the limit of files to download ($dl_count out of $conf{'dl_limit'}).\n";
last;
}
}
if( ! $conf{'pretend'} )
{
my @sorted_paths = Unicode::Collate::->new->sort(keys %dl_list);
for my $filepath ( @sorted_paths )
{
my $proceed = 0;
if( -e $filepath )
{
my $filesize = (stat $filepath)[7];
if( $filesize != $dl_list{$filepath}->{'length'} )
{
if( $dl_list{$filepath}->{'length'} == 1 || ( $conf{'skipOredl'} && substr($filepath,-4) eq '.pdf' ) )
{
warn "WARNING: Skipping redownload of $filepath...\n" if $conf{'verbose'};
}
else
{
warn "WARNING: $filepath already exists but with size $filesize instead of ".$dl_list{$filepath}->{'length'}."... overwriting.\n";
$proceed++;
}
}
if( $conf{'overwrite'} )
{
$proceed++;
}
}
else # ...if the file doesn't exist
{
$proceed++;
}
if( $proceed )
{
print "Downloading $filepath... ";
my $response = $ua->get(
$dl_list{$filepath}->{'url'},
':content_file' => $filepath,
':read_size_hint' => $dl_list{$filepath}->{'length'},
);
if( ! $response->is_success )
{
print "Error downloading $dl_list{$filepath}->{'url'}! " . $response->status_line . "\n";
push @errors, "$dl_list{$filepath}->{'url'} -> $filepath (" . $response->status_line . ")\n";
}
else # Only set the timestamp and report success when the download actually worked
{
utime $dl_list{$filepath}->{'date'}, $dl_list{$filepath}->{'date'}, $filepath;
print "done\n";
}
$dl_count++; # Counts attempts, including failed ones
sleep $conf{'throttle'};
}
}
}
if( @errors )
{
print "Summary of errors:\n";
print for @errors;
}
__END__
# For your viewing pleasure, the list of summaries that won't be written to disk:
#$skip++ if $summarylength <= 15; # "##PostExcerpt##"
#$skip++ if $summarylength <= 47; # "Stop by JapanesePod101.com and leave us a post!"
#$skip++ if $summarylength <= 51; # "Taking a closer look at Kanji (Chinese characters)."
#$skip++ if $summarylength <= 58; # "Stop by JapanesePod101.com and be sure to leave us a post!"
#$skip++ if $summarylength <= 72; # "Happy Birthday JapanesePod101.com!! Stop by the site to leave us a post!"
#$skip++ if $summarylength <= 75; # "Stop by JapanesePod101.com for more great Japanese Language Learning tools!"
#$skip++ if $summarylength <= 76; # "Stop by JapanesePod101.com for the accompanying PDFs, bonus track, and more!"
#$skip++ if $summarylength <= 78; # "Stop by JapanesePod101.com for more great Japanese language learning material."
#$skip++ if $summarylength <= 80; # "Stop by JapanesePod101.com to download the video and be sure to leave us a post!"
#$skip++ if $summarylength <= 83; # "Stop by JapanesePod101.com for the accompanying PDFs, Japanese subtitles, and more!"
#$skip++ if $summarylength <= 90; # "Be sure to stop by JapanesePod101.com for more great Japanese Language Learning resources."
#$skip++ if $summarylength <= 94; # "Learn Japanese with JapanesePod101.com! Stop by JapanesePod101.com for the accompanying video!"
#$skip++ if $summarylength <= 100; # "Stop by JapanesePod101.com to download the video, accompanying PDFs, and be sure to leave us a post!"
#$skip++ if $summarylength <= 104; # "Stop by JapanesePod101.com for more details about our open content call, and be sure to leave us a post!"
#$skip++ if $summarylength <= 105; # "Stop by JapanesePod101.com to find out more about our growing community, and be sure to leave us a post."
#$skip++ if $summarylength <= 111; # "Stop by JapanesePod101.com for the accompanying PDFs, line-by-line audio, and more! Be sure to leave us a post!"
#$skip++ if $summarylength <= 118; # "Stop by JapanesePod101.com for Lesson Notes, and test yourself in the Learning Center! And be sure to leave us a post!"
#$skip++ if $summarylength <= 133; # "Stop by JapanesePod101.com and be sure to leave us a post!\nHappy New Year from everyone at JapanesePod101.com!\nよいお年を!・Yoi otoshi o!"
#$skip++ if $summarylength <= 134; # "Learn Japanese with JapanesePod101.com! After listening, stop by JapanesePod101.com and be sure to leave us a post!\n\n★☆★☆★☆★☆★☆★☆★☆★"
#$skip++ if $summarylength <= 135; # "Learn Japanese with JapanesePod101!\nDon\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n"
#$skip++ if $summarylength <= 141; # "Learn Japanese with JapanesePod101!\nDon\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!\n\n\n\n"
#$skip++ if $summarylength <= 144; # "Happy New Year to all our listeners!\n\nStop by JapanesePod101.com to find out more about our growing community, and be sure to leave us a post."
#$skip++ if $summarylength <= 151; # "Stop by JapanesePod101.com and check our newest feature Video User Guide, which walks you through JapanesePod101.com's Lesson Specific Learning Center."
#$skip++ if $summarylength <= 154; # "Stop by http://www.japanesepod101.com/ for the most comprehensive, user friendly, helpful, and interactive language learning resource on the internet!\n\n"
#$skip++ if $summarylength <= 181; # "Stop by JapanesePod101.com for the accompanying PDF and transcripts! If you enjoy the intros and getting a behind the scenes look at JapanesePod101.com, you won\'t wanna miss this!"
#$skip++ if $summarylength <= 205; # "The release of JapanesePod101.com version 2!! This is big news that you absolutely can't miss! Stop by JapanesePod101.com and check out the new page, sign up for a 7-Day Free Trial, and leave us a comment!"
#$skip++ if $summarylength <= 219; # "Happy New Year from Asakusa, Tokyo! Today we release some of the video footage from our visit to Asakusa Temple on New Year's Day. Meet Peter from the show and his one-kind-friend Ryo. This guy you don't want to miss."
#$skip++ if $summarylength <= 247; # "Increase Vocabulary and Phrases!\n\nThere is a reason we have all used flashcards at some point in our studies. The bottom line is they WORK. At http://www.japanesepod101.com/, we understand this, and offer flashcards for all levels of your study.\n\n"
#$skip++ if $summarylength <= 251; # "Learn Japanese with podcasts and Videocasts! Yes, Japanesepod101.com videocasts are back and better than ever. Today we introduce you to the Japanese taxi, which we covered in Survival Phrases #5. This is some rare footage that you don't want to miss!"
#$skip++ if $summarylength <= 255; # "Stop by JapanesePod101.com for the accompanying PDFs, line-by-line audio, and more! Be sure to leave us a post!\n\nStop by JapanesePod101.com for the most comprehensive, user-friendly, entertaining and interactive language learning resource on the Internet"
#$skip++ if $summarylength <= 258; # "Stop by JapanesePod101.com for the accompanying PDFs, line-by-line audio, and more! Be sure to leave us a post!\n\n\nStop by JapanesePod101.com for the most comprehensive, user-friendly, entertaining and interactive language learning resource on the Internet"
#$skip++ if $summarylength <= 276; # "Don\'t let another once in a lifetime opportunity pass you by! Go to www.innovativelanguage.com/FFC by midnight EST on July 26th to become a Founding Father of ThaiPod, CantoneseClass, PolishPod, GreekPod, and/or PortuguesePod101.com, and guarantee yourself 50% OFF forever!\n"
#$skip++ if $summarylength <= 285; # "Explore Japanese and Japanese Culture with Fellow Students!\n\nEngage with fellow learners and teachers and ask language related questions, culture related questions, and even leave lesson or feature requests. It is by far the most active and vibrant forum on Japanese on the Internet!\n\n"
#$skip++ if $summarylength <= 286; # "Stop by JapanesePod101.com for the accompanying PDFs, line-by-line audio, and more! Be sure to leave us a post!\n\n-------------------------\n\nStop by JapanesePod101.com for the most comprehensive, user-friendly, entertaining and interactive language learning resource on the Internet!"
#$skip++ if $summarylength <= 287; # "Improve Pronunciation, Fluency!\n\nDrastically improve your pronunciation with the voice recording tool in the premium learning center. Record your voice with a click of a button, and playback what you record just as easily. This tool is the perfect complement to the line-by-line audio.\n\n"
#$skip++ if $summarylength <= 292; # "Stop by JapanesePod101.com for the accompanying PDFs, line-by-line audio, and more! Be sure to leave us a post!\n\n-------Lesson Dialog-------\nJapanese\n\n\n----Formal English----\n\n\n---------------------------\nStop by JapanesePod101.com for the Fastest, Easiest and Most Fun Way to Learn Japanese!"
#$skip++ if $summarylength <= 302; # "Stop by JapanesePod101.com for the accompanying PDFs, line-by-line audio, and more! Be sure to leave us a post!\n\n-------Lesson Dialog-------\nJapanese\n\n\n----Formal English----\n\n\n---------------------------\nStop by JapanesePod101.com for the Fastest, Easiest and Most Fun Way to Learn Japanese!"
XML example, two items:
_______
<item>
<title>Lower Intermediate Lesson #46 - Mothers' Talk - Lesson Notes</title>
<pubDate>Thu, 11 Sep 2014 18:29:59 +0900</pubDate>
<guid>http://www.japanesepod101.com/premium_feed/pdfs/601_LI46_101807_jpod101.pdf</guid>
<enclosure url="http://www.japanesepod101.com/premium_feed/pdfs/601_LI46_101807_jpod101.pdf" length="170903" type="application/pdf"/> <itunes:author>JapanesePod101.com</itunes:author>
<itunes:explicit>No</itunes:explicit>
<itunes:block>No</itunes:block>
<itunes:duration>1:00</itunes:duration>
</item>
<item>
<title>Lower Intermediate Lesson #46 - Mothers' Talk - Audio</title>
<pubDate>Thu, 11 Sep 2014 18:30:00 +0900</pubDate>
<guid>http://media.libsyn.com/media/japanesepod101/601_LI46_101807_jpod101.mp3</guid>
<enclosure url="http://media.libsyn.com/media/japanesepod101/601_LI46_101807_jpod101.mp3" length="7469804" type="audio/mpeg"/>
<itunes:summary>Learn Japanese with JapanesePod101!
Don\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!
-------Lesson Dialog-------
----Formal ----
[...text...]
---------------------------
Learn Japanese with JapanesePod101!
Don\'t forget to stop by JapanesePod101.com for more great Japanese Language Learning Resources!</itunes:summary> <itunes:author>JapanesePod101.com</itunes:author>
<itunes:explicit>No</itunes:explicit>
<itunes:block>No</itunes:block>
<itunes:duration>14:48</itunes:duration>
</item>
_______
For these two items, the folder would be "Lower Intermediate Lesson #46 - Mothers' Talk", containing the files "Audio.mp3", "Audio.txt" (the <itunes:summary> text) and "Lesson Notes.pdf".