今天想把经典的电影拿来当听力训练,可惜找了很久都没有找到直接下载方式,没有办法,只好下载电影然后提取音频并配上字幕。网上找了一个perl版的srt2lrc.pl,因为完全不懂perl,该脚本无法运行,所以只好自己想办法。没有仔细研究过srt与lrc格式,只是找了两个文件来看,勉强知道是怎么回事,用python写了一个转化的脚本(ps. python不熟,代码很难看,只是勉强实现了功能)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import re

if len(sys.argv) != 2:
    print("Usage srt2lrc.py file")
    sys.exit(1)

f = open(sys.argv[1])
lines = f.readlines()
i = 0
while i < len(lines):
    line = lines[i]
    if line.find('-->') == -1:
        i = i + 1
        continue

    s = line.split('-->')
    time = s[0].split(':')
    ms = '00'
    if len(time) >= 3:
        hour = time[0]
        minute = time[1]
        seconds = time[2].split(',')
        second = seconds[0]
        ms = seconds[1]
    else:
        hour = 0
        minute = time[2]
        seconds = time[3].split(',')
        second = seconds[0]
        ms = seconds[1]

    if int(hour) == 0 and int(minute) == 0 and int(second) == 0 and int(ms) == 0:
        i = i + 1
        continue

    i = i + 1
    s = lines[i].strip()
    i = i + 1
    while i < len(lines) and len(lines[i].strip()) > 0:
        s = s + lines[i].strip()
        i = i + 1

    s = re.sub("<.*>", "", s)
    if len(s) > 0:
        print("[%.2d:%s.%.2s]%s" % (int(hour)*60+int(minute.strip()), second.strip(), ms.strip(), s))

f.close()