Command line usage

CCExtractor's main program is console based. There is a GUI for Windows, as well as provisions so other programs can easily interface with CCExtractor, but the heavy lifting is done by a command line program that can be called from scripts, so integration with larger processes is straightforward.
Running CCExtractor without any parameters displays a help screen with all the options. As of version 1.0 the help screen is as follows:

CCExtractor 1.0, Carlos Fernandez Sanz, Volker Quetschke..
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Originally based on McPoodle's tools. Check his page for lots of information
on closed captions technical details.
(http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/SCC_TOOLS.HTML)

This tool home page:
http://www.ccextractor.org
Extracts closed captions and teletext subtitles from video streams.
(DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo, Dish Network,
.mp4, HDHomeRun are known to work).

Syntax:
ccextractor [options] inputfile1 [inputfile2...] [-o outputfilename]

Arguments:
  [inputfile]...
          file(s) to process
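
Because the program is meant to be called from scripts, the syntax line above can be turned into a small wrapper. The following is a minimal sketch, not part of CCExtractor: the helper name `build_command` and the file names are hypothetical, and the only flags used (`--out`, `-o`) are ones listed in this help screen.

```python
from typing import List, Optional, Sequence


def build_command(
    inputs: Sequence[str],
    options: Sequence[str] = (),
    output: Optional[str] = None,
    exe: str = "ccextractor",  # assumed to be on PATH
) -> List[str]:
    """Build an argv list following: ccextractor [options] inputfile1 [inputfile2...] [-o outputfilename]."""
    if not inputs:
        raise ValueError("at least one input file is required")
    argv = [exe, *options, *inputs]
    if output is not None:
        argv += ["-o", output]
    return argv


# Hypothetical file names, for illustration only.
command = build_command(["part1.ts", "part2.ts"], ["--out", "srt"], "show.srt")
```

The resulting list can be handed to `subprocess.run(command, check=True)`; passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.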
        
Options:
  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version

File name related options:
  -o
          Use -o parameters to define output filename if you don't
          like the default ones (same as infile plus _1 or _2 when
          needed and file extension, e.g. .srt).

  --stdout
          Write output to stdout (console) instead of file. If
          stdout is used, then -o can't be used. Also
          --stdout will redirect all messages to stderr (error).

  --pesheader
          Dump the PES Header to stdout (console). This is
          used for debugging purposes to see the contents
          of each PES packet header.

  --debugdvbsub
          Write the DVB subtitle debug traces to console

  --ignoreptsjumps
          Ignore PTS jumps (default)

  --fixptsjumps
          Fix PTS jumps. Use this parameter if you
          experience timeline resets/jumps in the output.

  --stdin
          Reads input from stdin (console) instead of file.
          Alternatively, - can be used instead of --stdin
        
Output File Segmentation:
  --outinterval

  --segmentonkeyonly
          When segmenting files, do it only after an I frame,
          trying to behave like FFmpeg
        
Network support:
  --udp <[[src@]host:]port>
          Read the input via UDP (listening on the specified port)
          instead of reading a file.
          Host and src can be a hostname or IPv4 address.
          If host is not specified then listens on the local host.
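
The `[[src@]host:]port` notation means the port is mandatory, the host is optional, and a source can only be given together with a host. A minimal sketch of how a calling script might validate such a value before passing it to `--udp`; the names `UdpSpec` and `parse_udp_spec` are hypothetical and this is not CCExtractor's own parser.

```python
from typing import NamedTuple, Optional


class UdpSpec(NamedTuple):
    src: Optional[str]
    host: Optional[str]
    port: int


def parse_udp_spec(spec: str) -> UdpSpec:
    """Split a '[[src@]host:]port' string into its parts (hostnames or IPv4 only)."""
    src: Optional[str] = None
    host: Optional[str] = None
    port_text = spec
    if ":" in spec:
        address, port_text = spec.rsplit(":", 1)
        if "@" in address:
            src, host = address.split("@", 1)
            if not src:
                raise ValueError("empty src in UDP spec")
        else:
            host = address
        if not host:
            raise ValueError("empty host in UDP spec")
    port = int(port_text)  # raises ValueError for non-numeric ports
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return UdpSpec(src, host, port)
```

With only a port, `host` stays `None`, which corresponds to listening on the local host as described above.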
        
  --src
          Can be a hostname or IPv4 address.

  --sendto
          Sends data in BIN format to the server
          according to CCExtractor's protocol over
          TCP. For IPv6 use [address] instead

  --sendto-port
          Specifies optional port for sendto

  --tcp
          Reads the input data in BIN format according to
          CCExtractor's protocol, listening on the specified port
          on the local host

  --tcp-password
          Sets server password for new connections to
          tcp server

  --tcp-description
          Sends to the server a short description about
          captions e.g. channel name or file name
        
Options that affect what will be processed:
  --output-field
          Values: 1 = Output Field 1
          2 = Output Field 2
          both = Both Output Field 1 and 2
          Defaults to 1

  --append
          Use --append to prevent overwriting of existing files. The output will be
          appended instead.

  --cc2
          When in srt/sami mode, process captions in channel 2
          instead of channel 1.

  --service
          Enable CEA-708 (DTVCC) captions processing for the listed
          services. The parameter is a comma delimited list
          of service numbers, such as "1,2" to process the
          primary and secondary language services.
          Pass "all" to process all services found.
          If captions in a service are stored in 16-bit encoding,
          you can specify what charset or encoding was used. Pass
          its name after service number (e.g. "1[EUC-KR],3" or
          "all[EUC-KR]") and it will convert the specified charset to
          UTF-8 using iconv. See iconv documentation to check if
          required encoding/charset is supported.
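
The `--service` value format (comma delimited items, each either a number or "all", optionally followed by a charset in square brackets) can be checked in a script before invoking the program. This is a minimal sketch under that reading of the text above; `parse_services` and `to_utf8` are hypothetical names, and Python's built-in codecs stand in for iconv here.

```python
import re
from typing import List, Optional, Tuple, Union

# One item: "all" or a service number, optionally followed by "[CHARSET]".
_ITEM = re.compile(r"^(all|\d+)(?:\[([^\[\]]+)\])?$")


def parse_services(value: str) -> List[Tuple[Union[int, str], Optional[str]]]:
    """Parse strings such as '1,2', '1[EUC-KR],3' or 'all[EUC-KR]'."""
    result: List[Tuple[Union[int, str], Optional[str]]] = []
    for item in value.split(","):
        match = _ITEM.match(item.strip())
        if match is None:
            raise ValueError(f"bad service item: {item!r}")
        service, charset = match.groups()
        result.append((service if service == "all" else int(service), charset))
    return result


def to_utf8(raw: bytes, charset: str) -> bytes:
    """Re-encode caption bytes from the given charset to UTF-8 (Python codecs, not iconv)."""
    return raw.decode(charset).encode("utf-8")
```

For example, `parse_services("1[EUC-KR],3")` yields service 1 with charset `EUC-KR` and service 3 with no charset.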
        
Input Formats:
  --input
          With the exception of McPoodle's raw format, which is just the closed
          caption data with no other info, CCExtractor can usually detect the
          input format correctly. Use this parameter to override the detected format.

          Possible values:
          - ts:   For Transport Streams
          - ps:   For Program Streams
          - es:   For Elementary Streams
          - asf:  ASF container (such as DVR-MS)
          - wtv:  Windows Television (WTV)
          - bin:  CCExtractor's own binary format
          - raw:  For McPoodle's raw files
          - mp4:  MP4/MOV/M4V and similar
          - m2ts: BDAV MPEG-2 Transport Stream
          - mkv:  Matroska container and WebM
          - mxf:  Material Exchange Format (MXF)
        
Output Formats:
  --out
          Possible values:
          - srt:         SubRip (default, so not actually needed)
          - ass:         SubStation Alpha
          - ssa:         SubStation Alpha
          - ccd:         Scenarist Closed Caption Disassembly format
          - scc:         Scenarist Closed Caption format
          - webvtt:      WebVTT format
          - webvtt-full: WebVTT format with styling
          - sami:        MS Synchronized Accessible Media Interface
          - bin:         CC data in CCExtractor's own binary format
          - raw:         CC data in McPoodle's Broadcast format
          - dvdraw:      CC data in McPoodle's DVD format
          - mcc:         CC data compressed using MacCaption Format
          - txt:         Transcript (no time codes, no roll-up captions, just the plain transcription)
          - ttxt:        Timed Transcript (transcription with time info)
          - g608:        Grid 608 format
          - smptett:     SMPTE Timed Text (W3C TTML) format
          - spupng:      Set of .xml and .png files for use with dvdauthor's spumux. See "Notes on spupng output format"
          - null:        Don't produce any file output
          - report:      Prints to stdout information about captions in specified input. Don't produce any file output
          - simple-xml
        
Options that affect how input files will be processed:
  --goptime
          Use GOP for timing instead of PTS. This only applies
          to Program or Transport Streams with MPEG2 data and
          overrides the default PTS timing.
          GOP timing is always used for Elementary Streams.

  --no-goptime
          Never use GOP timing (use PTS), even if ccextractor
          detects GOP timing is the reasonable choice.

  --fixpadding
          Fix padding - some cards (or providers, or whatever)
          seem to send 0000 as CC padding instead of 8080. If you
          get bad timing, this might solve it.

  --90090
          Use 90090 (instead of 90000) as MPEG clock frequency.
          (reported to be needed at least by Panasonic DMR-ES15
          DVD Recorder)
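
To see why the clock frequency matters, consider converting a PTS value (a tick count) to milliseconds. The sketch below is only the arithmetic, with a hypothetical helper name `pts_to_ms`; it does not reproduce CCExtractor's internal timing code.

```python
def pts_to_ms(pts_ticks: int, clock_hz: int = 90000) -> int:
    """Convert a PTS tick count to whole milliseconds for a given clock frequency."""
    return pts_ticks * 1000 // clock_hz


# Ticks that represent one hour when the clock really runs at 90000 Hz.
one_hour_ticks = 90000 * 3600

ms_at_90000 = pts_to_ms(one_hour_ticks)         # 3600000 ms
ms_at_90090 = pts_to_ms(one_hour_ticks, 90090)  # 3596403 ms
drift_ms = ms_at_90000 - ms_at_90090            # 3597 ms, about 3.6 s per hour
```

A mismatch between the assumed and actual clock therefore shows up as subtitle drift of roughly 3.6 seconds per hour of video.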
        
  --videoedited
          By default, ccextractor will process input files in
          sequence as if they were all one large file (i.e.
          split by a generic, non video-aware tool). If you
          are processing video that was split with an editing
          tool, use --videoedited so ccextractor doesn't try to rebuild
          the original timing.

  -s, --stream
          Consider the file as a continuous stream that is
          growing as ccextractor processes it, so don't try
          to figure out its size and don't terminate processing
          when reaching the current end (i.e. wait for more
          data to arrive). If the optional parameter secs is
          present, it means the number of seconds without any
          new data after which ccextractor should exit. Use
          this parameter if you want to process a live stream
          but not kill ccextractor externally.
          Note: If -s is used then only one input file is
          allowed.

  --usepicorder
          Use the pic_order_cnt_lsb in AVC/H.264 data streams
          to order the CC information. The default way is to
          use the PTS information. Use this switch only when
          needed.

  --myth
          Force MythTV code branch.

  --no-myth
          Disable MythTV code branch.
          The MythTV branch is needed for analog captures where
          the closed caption data is stored in the VBI, such as
          those with bttv cards (Hauppauge 250 for example). This
          is detected automatically so you don't need to worry
          about this unless autodetection doesn't work for you.
        
  --wtvconvertfix
          This switch works around a bug in Windows 7's built in
          software to convert *.wtv to *.dvr-ms. For analog NTSC
          recordings the CC information is marked as digital
          captions. Use this switch only when needed.

  --wtvmpeg2
          Read the captions from the MPEG2 video stream rather
          than the captions stream in WTV files

  --program-number
          In TS mode, specifically select a program to process.
          Not needed if the TS only has one. If this parameter
          is not specified and CCExtractor detects more than one
          program in the input, it will list the programs found
          and terminate without doing anything, unless
          --autoprogram (see below) is used.

  --autoprogram
          If there's more than one program in the stream, just use
          the first one we find that contains a suitable stream.

  --multiprogram
          Uses multiple programs from the same input stream.

  -L, --list-tracks
          List all tracks found in the input file and exit without
          processing. Useful for exploring media files before extraction.

  --datapid
          Don't try to find out the stream for caption/teletext
          data, just use this one instead.

  --datastreamtype
          Instead of selecting the stream by its PID, select it
          by its type (pick the stream that has this type in
          the PMT)

  --streamtype
          Assume the data is of this type, don't autodetect. This
          parameter may be needed if --datapid or --datastreamtype
          is used and CCExtractor cannot determine how to process
          the stream. The value will usually be 2 (MPEG video) or
          6 (MPEG private data).

  --hauppauge
          If the video was recorded using a Hauppauge card, it
          might need special processing. This parameter will
          force the special treatment.
        
  --mp4vidtrack
          In MP4 files the closed caption data can be embedded in
          the video track or in a dedicated CC track. If a
          dedicated track is detected it will be processed instead
          of the video track. If you need to force the video track
          to be processed instead use this option.

  --no-autotimeref
          Some streams come with broadcast date information. When
          such data is available, CCExtractor will set its time
          reference to the received data. Use this parameter if
          you prefer your own reference. Note: Currently this only
          affects Teletext in timed transcript with --datets.

  --no-scte20
          Ignore SCTE-20 data if present.

  --webvtt-create-css
          Create a separate file for CSS instead of inline.

  --deblev
          Enable debug so the calculated distance for each two
          strings is displayed. The output includes both strings,
          the calculated distance, the maximum allowed distance,
          and whether the strings are ultimately considered
          equivalent or not, i.e. the calculated distance is
          less than or equal to the max allowed.

  --analyzevideo
          Analyze the video stream even if it's not used for
          subtitles. This allows video information to be provided.

  --timestamp-map
          Enable the X-TIMESTAMP-MAP header for WebVTT (HLS)
          312
          Levenshtein distance:
        
          313
          --no-levdist
        
          314
          Don't attempt to correct typos with Levenshtein distance.
        
          316
          --levdistmincnt 
        
          317
          Minimum distance we always allow regardless
        
          318
          of the length of the strings.Default 2.
        
          319
          This means that if the calculated distance
        
          320
          is 0,1 or 2, we consider the strings to be equivalent.
        
          322
          --levdistmaxpct 
        
          323
          Maximum distance we allow, as a percentage of
        
          324
          the shortest string length. Default 10%.0
        
          325
          For example, consider a comparison of one string of
        
          326
          30 characters and one of 60 characters. We want to
        
          327
          determine whether the first 30 characters of the longer
        
          328
          string are more or less the same as the shortest string,
        
          329
          i.e. whether the longest string  is the shortest one
        
          330
          plus new characters and maybe some corrections. Since
        
          331
          the shortest string is 30 characters and  the default
        
          332
          percentage is 10%, we would allow a distance of up
        
          333
          to 3 between the first 30 characters.
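
The comparison described above can be sketched in a few lines. This is an illustration of the technique, not CCExtractor's implementation: the function names are hypothetical, and combining the two limits as "the larger of the minimum count and the percentage of the shorter length" is an assumption based on the wording "minimum distance we always allow".

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def max_allowed(short_len: int, min_cnt: int = 2, max_pct: int = 10) -> int:
    """Assumed rule: the larger of min_cnt and max_pct percent of the shorter length."""
    return max(min_cnt, short_len * max_pct // 100)


def is_equivalent(s1: str, s2: str, min_cnt: int = 2, max_pct: int = 10) -> bool:
    """Compare the shorter string with the same-length prefix of the longer one."""
    short, long_ = sorted((s1, s2), key=len)
    distance = levenshtein(short, long_[:len(short)])
    return distance <= max_allowed(len(short), min_cnt, max_pct)
```

With the defaults, a 30-character string gives `max_allowed(30) == 3`, matching the worked example, while a 5-character string still gets the minimum allowance of 2.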
        
Options that affect what kind of output will be produced:
  --chapters
          (Experimental) Produces a chapter file from MP4 files.
          Note that this must only be used with MP4 files,
          for other files it will simply generate a subtitles file.

  --bom
          Append a BOM (Byte Order Mark) to output files.
          Note that most text processing tools in Linux will not
          like BOM.

  --no-bom
          Do not append a BOM (Byte Order Mark) to output
          files. Note that this may break files when using
          Windows. This is the default in non-Windows builds.

  --unicode
          Encode subtitles in Unicode instead of Latin-1.

  --utf8
          Encode subtitles in UTF-8 (no longer needed
          because UTF-8 is now the default).

  --latin1
          Encode subtitles in Latin-1

  --no-fontcolor
          For .srt/.sami/.vtt, don't add font color tags.

  --no-htmlescape
          For .srt/.sami/.vtt, don't convert HTML unsafe characters

  --no-typesetting
          For .srt/.sami/.vtt, don't add typesetting tags.

  --trim
          Trim lines.

  --defaultcolor
          Select a different default color (instead of
          white). This causes all output in .srt/.smi/.vtt
          files to have a font tag, which makes the files
          larger. Add the color you want in RGB, such as
          --defaultcolor #FF0000 for red.

  --sentencecap
          Sentence capitalization. Use if you hate
          ALL CAPS in subtitles.

  --capfile
          Add the contents of 'file' to the list of words
          that must be capitalized. For example, if file
          is a plain text file that contains

          Tony
          Alan

          Whenever those words are found they will be written
          exactly as they appear in the file.
          Use one line per word. Lines starting with # are
          considered comments and discarded.
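
The capitalization file format described above (one word per line, `#` for comments) is simple enough to illustrate. The following is a rough sketch of the idea only; `load_cap_words` and `apply_cap_words` are hypothetical names, and skipping blank lines and matching whole words case-insensitively are assumptions rather than documented behavior.

```python
import re
from typing import Dict, Iterable


def load_cap_words(lines: Iterable[str]) -> Dict[str, str]:
    """Map each lowercased word to the exact spelling given in the file."""
    words: Dict[str, str] = {}
    for line in lines:
        word = line.strip()
        if not word or word.startswith("#"):  # blank-line skipping is an assumption
            continue
        words[word.lower()] = word
    return words


def apply_cap_words(text: str, words: Dict[str, str]) -> str:
    """Rewrite every listed word exactly as it appears in the file."""
    def fix(match: "re.Match[str]") -> str:
        token = match.group(0)
        return words.get(token.lower(), token)
    return re.sub(r"\w+", fix, text)


cap_words = load_cap_words(["# names", "Tony", "Alan"])
fixed = apply_cap_words("tony said hello to ALAN", cap_words)
```

Here `fixed` is "Tony said hello to Alan"; words not listed in the file are left untouched.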
        
  --kf
          Censors profane words from subtitles.

  --profanity-file
          Add the contents of the file to the list of words that
          must be censored. The content of the file follows the
          same syntax as for the capitalization file

  --splitbysentence
          Split output text so each frame contains a complete
          sentence. Timings are adjusted based on number of
          characters

  --unixts
          For timed transcripts that have an absolute date
          (instead of a timestamp relative to the file start), use
          this time reference (UNIX timestamp). 0 => Use current
          system time.
          ccextractor will automatically switch to transport
          stream UTC timestamps when available.

  --datets
          In transcripts, write time as YYYYMMDDHHMMss,ms.

  --sects
          In transcripts, write time as ss,ms
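
The two transcript time layouts can be illustrated with a short sketch. The function names are hypothetical, and several details are assumptions not stated in the help text: milliseconds are zero-padded to three digits, absolute times are rendered in UTC, and the special meaning of `--unixts 0` (use current system time) is not modeled.

```python
from datetime import datetime, timedelta, timezone


def format_sects(offset_ms: int) -> str:
    """Relative time as 'ss,ms' (seconds since file start, then milliseconds)."""
    return f"{offset_ms // 1000},{offset_ms % 1000:03d}"


def format_datets(unix_ref: int, offset_ms: int) -> str:
    """Absolute time as 'YYYYMMDDHHMMss,ms' from a UNIX reference plus an offset."""
    moment = datetime.fromtimestamp(unix_ref, tz=timezone.utc) + timedelta(milliseconds=offset_ms)
    return moment.strftime("%Y%m%d%H%M%S") + f",{moment.microsecond // 1000:03d}"
```

For a caption 83456 ms into the file, `format_sects` gives "83,456"; with a reference of 86400 (one day after the epoch) and an offset of 1500 ms, `format_datets` gives "19700102000001,500".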
        
  --ucla
          Transcripts are generated with a specific format
          that is convenient for a specific project, feel
          free to play with it but be aware that this format
          is really live - don't rely on its output format
          not changing between versions.

  --latrusmap
          Map Latin symbols to Cyrillic ones in special cases
          of Russian Teletext files (issue #1086)

  --ttxtforcelatin
          Force Latin G0 charset for Teletext, ignoring any Cyrillic
          designation in the stream. Use when broadcasts incorrectly
          signal Cyrillic but content is Latin (issue #1395)

  --xds
          In timed transcripts, all XDS information will be saved
          to the output file.

  --lf
          Use LF (UNIX) instead of CRLF (DOS, Windows) as line
          terminator.

  --df
          For MCC Files, force dropframe frame count.

  --autodash
          Based on position on screen, attempt to determine
          the different speakers and add a dash (-) when each
          of them talks (.srt/.vtt only, --trim required).

  --xmltv
          Produce an XMLTV file containing the EPG data from
          the source TS file. Mode: 1 = full output,
          2 = live output, 3 = both

  --xmltvliveinterval
          Interval of x seconds between writing live mode xmltv output.

  --xmltvoutputinterval
          Interval of x seconds between writing full file xmltv output.

  --xmltvonlycurrent
          Only print current events for xmltv output.

  --sem
          Create a .sem file for each output file that is open
          and delete it on file close.
        
  --dvblang
          For DVB subtitles, select which language's caption
          stream will be processed. e.g. 'eng' for English.
          If there are multiple languages, only this specified
          language stream will be processed (default).

  --ocrlang
          Manually select the name of the Tesseract .traineddata
          file. Helpful if you want to OCR a caption stream of
          one language with the data of another language.
          e.g. '--dvblang chs --ocrlang chi_tra' will decode the
          Chinese (Simplified) caption stream but perform OCR
          using the Chinese (Traditional) trained data
          This option is also helpful when the traineddata file
          has non standard names that don't follow ISO specs

  --quant
          How to quantize the bitmap before passing it to tesseract
          for OCR'ing.
          0: Don't quantize at all.
          1: Use CCExtractor's internal function (default).
          2: Reduce distinct color count in image for faster results.

  --oem
          Select the OEM mode for Tesseract.
          Available modes :
          0: OEM_TESSERACT_ONLY - the fastest mode.
          1: OEM_LSTM_ONLY - use LSTM algorithm for recognition.
          2: OEM_TESSERACT_LSTM_COMBINED - both algorithms.
          Default value depends on the tesseract version linked :
          Tesseract v3 : default mode is 0,
          Tesseract v4 : default mode is 1.

  --psm
          Select the PSM mode for Tesseract.
          Available Page segmentation modes:
          0    Orientation and script detection (OSD) only.
          1    Automatic page segmentation with OSD.
          2    Automatic page segmentation, but no OSD, or OCR.
          3    Fully automatic page segmentation, but no OSD. (Default)
          4    Assume a single column of text of variable sizes.
          5    Assume a single uniform block of vertically aligned text.
          6    Assume a single uniform block of text.
          7    Treat the image as a single text line.
          8    Treat the image as a single word.
          9    Treat the image as a single word in a circle.
          10   Treat the image as a single character.
          11   Sparse text. Find as much text as possible in no particular order.
          12   Sparse text with OSD.
          13   Raw line. Treat the image as a single text line,
               bypassing hacks that are Tesseract-specific.
        
  --mkvlang
          For MKV subtitles, select which language's caption
          stream will be processed. e.g. 'eng' for English.
          Language codes can be either the 3-letter bibliographic
          ISO-639-2 form (like "fre" for French) or a language
          code followed by a dash and a country code for specialities
          in languages (like "fre-ca" for Canadian French).

  --no-spupngocr
          When processing DVB don't use the OCR to write the text as
          comments in the XML file.

  --font
          Specify the full path of the font that is to be used when
          generating SPUPNG files. If not specified, you need to
          have the default font installed (Helvetica for macOS, Calibri
          for Windows, and Noto for other operating systems at their
          default location)

  --italics
          Specify the full path of the italics font that is to be used when
          generating SPUPNG files. If not specified, you need to
          have the default font installed (Helvetica Oblique for macOS, Calibri Italic
          for Windows, and NotoSans Italic for other operating systems at their
          default location)
        
          Options that affect how ccextractor reads and writes (buffering):
          --bufferinput
          Forces input buffering.

          --no-bufferinput
          Disables input buffering.

          --buffersize
          Specify a size for reading, in bytes (suffix with K
          or M for kilobytes and megabytes). Default is 16M.

          --koc
          keep-output-close. If used then CCExtractor will close
          the output file after writing each subtitle frame and
          attempt to create it again when needed.

          --forceflush
          Flush the file buffer whenever content is written.

          --dru
          Direct Roll-Up. When in roll-up mode, write character by
          character instead of line by line. Note that this
          produces (much) larger files.
        
          --no-rollup
          If you hate the repeated lines caused by the roll-up
          emulation, you can have ccextractor write only one
          line at a time, getting rid of these repeated lines.

          --ru1
          Roll-up captions can consist of 2, 3 or 4 visible
          lines at any time (the number of lines is part of
          the transmission). If having 3 or 4 lines annoys
          you, you can use --ru1, --ru2 or --ru3 to force the
          decoder to always use 1, 2 or 3 lines. Note that 1 line
          is not a real roll-up mode, so CCExtractor does what
          it can.
          With --ru1 the start timestamp is actually the timestamp
          of the first character received, which is possibly more
          accurate.

          --ru2

          --ru3
        
          Options that affect timing:
          --delay
          For srt/sami/webvtt, add this number of milliseconds to
          all times. For example, --delay 400 makes subtitles
          appear 400ms late. You can also use negative numbers
          to make subs appear early.

          Options that affect what segment of the input file(s) to process:
          --startat
          Only write caption information that starts after the
          given time.
          Time can be seconds, MM:SS or HH:MM:SS.
          For example, --startat 3:00 means 'start writing from
          minute 3'.

          --endat
          Stop processing after the given time (same format as
          --startat).
          The --startat and --endat options are honored in all
          output formats.  In all formats with timing information
          the times are unchanged.
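
The time format accepted by --startat and --endat (seconds, MM:SS or HH:MM:SS) can be illustrated with a short Python sketch. This is not CCExtractor's own parser; the function name `parse_time` is hypothetical, and rejecting anything other than plain digits is an assumption made here.

```python
def parse_time(text: str) -> int:
    """Convert 'S', 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    parts = text.split(":")
    # Assumption: one to three colon-separated fields, digits only.
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognised time: {text!r}")
    seconds = 0
    for part in parts:
        # Each field to the left is worth 60 times the one after it.
        seconds = seconds * 60 + int(part)
    return seconds
```

Under these assumptions, `parse_time("3:00")` gives 180, which matches the "minute 3" example in the help text.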
        
          --screenfuls
          Write 'num' screenfuls and terminate processing.
        
          Options that affect which codec is to be searched for in the input:
          --codec
          --codec dvbsub
          Select the DVB subtitle from all elementary streams.
          If no stream of DVB subtitle type is found, then
          nothing is selected and no subtitle is generated.
          --codec teletext
          Select the teletext subtitle from the elementary stream.

          [possible values: dvbsub, teletext]

          --no-codec
          --no-codec dvbsub
          Ignore the DVB subtitle and follow the default behaviour.
          --no-codec teletext
          Ignore the teletext subtitle.

          [possible values: dvbsub, teletext]
        
          Adding start and end credits:
          --startcreditstext
          Write this text as start credits. If there are
          several lines, separate them with the
          characters \n, for example Line1\nLine 2.

          --startcreditsnotbefore
          Don't display the start credits before this
          time (S, or MM:SS). Default: 0

          --startcreditsnotafter
          Don't display the start credits after this
          time (S, or MM:SS). Default: 5:00

          --startcreditsforatleast
          Start credits need to be displayed for at least
          this time (S, or MM:SS). Default: 2

          --startcreditsforatmost
          Start credits should be displayed for at most
          this time (S, or MM:SS). Default: 5

          --endcreditstext
          Write this text as end credits. If there are
          several lines, separate them with the
          characters \n, for example Line1\nLine 2.

          --endcreditsforatleast
          End credits need to be displayed for at least
          this time (S, or MM:SS). Default: 2

          --endcreditsforatmost
          End credits should be displayed for at most
          this time (S, or MM:SS). Default: 5
        
          Options that affect debug data:
          --debug
          Show lots of debugging output.

          --608
          Print debug traces from the EIA-608 decoder.
          If you need to submit a bug report, please send
          the output from this option.

          --708
          Print debug information from the (currently
          in development) EIA-708 (DTV) decoder.

          --goppts
          Enable lots of time stamp output.

          --xdsdebug
          Enable XDS debug data (lots of it).

          --vides
          Print debug info about the analysed elementary
          video stream.

          --cbraw
          Print debug trace with the raw 608/708 data with
          time stamps.

          --no-sync
          Disable the syncing code.  Only useful for debugging
          purposes.

          --fullbin
          Disable the removal of trailing padding blocks
          when exporting to bin format.  Only useful for
          debugging purposes.

          --parsedebug
          Print debug info about the parsed container
          file. (Only for TS/ASF files at the moment.)

          --parsePAT
          Print Program Association Table dump.

          --parsePMT
          Print Program Map Table dump.

          --dumpdef
          Hex-dump defective TS packets.

          --investigate-packets
          If no CC packets are detected based on the PMT, try
          to find data in all packets by scanning.
        
          Teletext related options:
          --tpage
          Use this page for subtitles (if this parameter
          is not used, try to autodetect). In Spain the
          page is always 888; it may vary in other countries.

          --tverbose
          Enable verbose mode in the teletext decoder.

          --teletext
          Force teletext mode even if teletext is not detected.
          If used, you should also pass --datapid to specify
          the stream ID you want to process.

          --no-teletext
          Disable teletext processing. This might be needed
          for video streams that have both teletext packets
          and CEA-608/708 packets (if teletext is processed
          then CEA-608/708 processing is disabled).
        
          Transcript customizing options:
          --customtxt
          Use the passed format to customize the (Timed) Transcript
          output. The format must be like this: 1100100 (7 digits).
          These indicate whether the following items should be
          displayed or not in the (timed) transcript. They
          represent (in order):
          - Display start time
          - Display end time
          - Display caption mode
          - Display caption channel
          - Use a relative timestamp (relative to the sample)
          - Display XDS info
          - Use colors
          Examples:
          0000101 is the default setting for transcripts
          1110101 is the default for timed transcripts
          1111001 is the default setting for --ucla
          Make sure you use this parameter after others that might
          affect these settings (--out, --ucla, --xds, --txt,
          --ttxt ...)
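
To make the 7-digit --customtxt format easier to read, here is a small Python sketch that maps each digit to the item it controls, in the order listed above. The field names and the function `decode_customtxt` are labels invented for this illustration; they are not identifiers from CCExtractor.

```python
# Hypothetical labels for the seven positions, in the documented order.
FIELDS = (
    "start_time",
    "end_time",
    "caption_mode",
    "caption_channel",
    "relative_timestamp",
    "xds_info",
    "colors",
)


def decode_customtxt(fmt: str) -> dict:
    """Map a 7-digit format such as '1110101' to named on/off switches."""
    if len(fmt) != len(FIELDS) or set(fmt) - {"0", "1"}:
        raise ValueError("format must be exactly 7 digits, each 0 or 1")
    return {name: digit == "1" for name, digit in zip(FIELDS, fmt)}
```

For instance, decoding the timed-transcript default 1110101 switches on the start time, end time, caption mode, relative timestamp and colors, and leaves the caption channel and XDS info off.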
        
          Communication with other programs and console output:
          --gui-mode-reports
          Report progress and interesting events to stderr
          in an easy-to-parse format. This is intended to be
          used by other programs. See the docs directory for
          details.

          --no-progress-bar
          Suppress the output of the progress bar.

          --quiet
          Don't write any message.
        
          Burned-in subtitle extraction:
          --hardsubx
          Enable the burned-in subtitle extraction subsystem.

          NOTE: This is needed to use the burned-in
          subtitle extractor options below.

          --tickertext
          Search for burned-in ticker text at the bottom of
          the screen.

          --ocr-mode
          Set the OCR mode to either frame-wise, word-wise
          or letter-wise.
          e.g. --ocr-mode frame (default), --ocr-mode word,
          --ocr-mode letter

          --subcolor
          Specify the color of the subtitles.
          Possible values are in the set
          {white,yellow,green,cyan,blue,magenta,red}.
          Alternatively, a custom hue value between 1 and 360
          may also be specified.
          e.g. --subcolor white or --subcolor 270 (for violet).
          Refer to an HSV color chart for values.
        
          --min-sub-duration
          Specify the minimum duration that a subtitle line
          must exist on the screen.
          The value is specified in seconds.
          A lower value gives better results, but takes more
          processing time.
          The recommended value is 0.5 (default).
          e.g. --min-sub-duration 1.0 (for a duration of 1 second)

          --detect-italics
          Specify whether italics are to be detected from the
          OCR text.
          Italic detection automatically forces the OCR mode
          to be word-wise.

          --conf-thresh
          Specify the classifier confidence threshold between
          1 and 100.
          Try to find a threshold that works for you if you get
          a lot of garbage text.
          e.g. --conf-thresh 50

          --whiteness-thresh
          For white subtitles only, specify the luminance
          threshold between 1 and 100.
          This threshold is content dependent, and adjusting
          values may give you better results.
          Recommended values are in the range 80 to 100.
          The default value is 95.

          --hcc
          Use this option if the file has both
          closed captions and burned-in subtitles.

          An example command for burned-in subtitle extraction is as follows:
          ccextractor video.mp4 --hardsubx --subcolor white --detect-italics --whiteness-thresh 90 --conf-thresh 60
        
          Notes on File name related options:
          You can pass as many input files as you need. They will be processed in order.
          If a file name is suffixed by +, ccextractor will try to follow a numerical
          sequence. For example, DVD001.VOB+ means DVD001.VOB, DVD002.VOB and so on
          until there are no more files.
          Output will be one single file (either raw or srt). Use this if you made your
          recording in several cuts (to skip commercials for example) but you want one
          subtitle file with contiguous timing.
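
The "+" suffix described above can be pictured with the following Python sketch. It is only an approximation written for this page: the function name `expand_input` is hypothetical, and it assumes that the number to increment is the run of digits at the end of the file name (before the extension) and that zero padding is preserved. CCExtractor's actual rules may differ.

```python
import re
from pathlib import Path


def expand_input(arg: str) -> list:
    """Expand 'DVD001.VOB+' into the existing files DVD001.VOB, DVD002.VOB, ..."""
    if not arg.endswith("+"):
        return [Path(arg)]          # ordinary file name, nothing to follow
    first = Path(arg[:-1])
    match = re.fullmatch(r"(.*?)(\d+)", first.stem)
    if match is None:               # no trailing number to increment
        return [first] if first.exists() else []
    prefix, digits = match.groups()
    number, width = int(digits), len(digits)
    found = []
    while True:
        candidate = first.with_name(f"{prefix}{number:0{width}d}{first.suffix}")
        if not candidate.exists():  # stop at the first gap in the sequence
            break
        found.append(candidate)
        number += 1
    return found
```

With DVD001.VOB, DVD002.VOB and DVD004.VOB on disk, this sketch stops after DVD002.VOB because DVD003.VOB is missing.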
        
          Notes on Options that affect what will be processed:
          In general, if you want English subtitles you don't need to use these options
          as they are broadcast in field 1, channel 1. If you want the second language
          (usually Spanish) you may need to try -2, or -cc2, or both.
        
          Notes on Levenshtein distance:
          When processing teletext files CCExtractor tries to correct typos by
          comparing consecutive lines. If line N+1 is almost identical to line N except
          for minor changes (plus newly added characters) then it assumes that line N had
          a typo that was corrected in N+1. This is currently implemented in teletext
          because it's where sample files that could benefit from this were available.
          You can adjust, or disable, the algorithm settings with the following
          parameters.
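
As a rough illustration of the idea in this note, the Python sketch below computes the Levenshtein distance and uses it to decide whether line N looks like an earlier, mistyped version of line N+1. The function names, the comparison against a same-length prefix of line N+1, and the threshold of 2 edits are all assumptions made for the example, not CCExtractor's actual settings.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def looks_like_corrected_typo(line_n: str, line_n1: str, max_distance: int = 2) -> bool:
    """True if line N+1 is line N with minor fixes, possibly plus new characters."""
    # Compare only the overlapping part, so text appended to N+1 is not penalised.
    prefix = line_n1[:len(line_n)]
    return levenshtein(line_n, prefix) <= max_distance
```

Here "Hellp world" followed by "Hello world and more" would be treated as a corrected typo, while two unrelated lines would not.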
        
          Notes on times:
          --startat and --endat times are used first, then --delay.
          So if you use --srt --startat 3:00 --endat 5:00 --delay 120000, ccextractor will
          generate a .srt file, with only data from 3:00 to 5:00 in the input file(s)
          and then add that (huge) delay, which would make the final file start at
          5:00 and end at 7:00.
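
The order of operations in this note (window first, delay second) can be checked with a small Python sketch. The cue representation (start in milliseconds, end in milliseconds, text), the function name `window_then_delay` and the rule of keeping a cue when its start time falls inside the window are assumptions made for the illustration.

```python
def window_then_delay(cues, start_s, end_s, delay_ms):
    """Keep cues that start inside [start_s, end_s], then shift them by delay_ms."""
    start_ms, end_ms = start_s * 1000, end_s * 1000
    # Step 1: the window is applied to the original, unshifted times.
    kept = [cue for cue in cues if start_ms <= cue[0] <= end_ms]
    # Step 2: the delay is added to whatever survived the window.
    return [(begin + delay_ms, end + delay_ms, text) for begin, end, text in kept]


cues = [
    (60_000, 62_000, "before the window"),
    (180_000, 182_000, "first kept cue"),
    (299_000, 300_000, "last kept cue"),
    (330_000, 332_000, "after the window"),
]
shifted = window_then_delay(cues, start_s=180, end_s=300, delay_ms=120_000)
```

With the values from the note (3:00 to 5:00, delay of 120000 ms), the first kept cue moves from 3:00 to 5:00 and the last one ends at 7:00.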
        
          Notes on codec options:
          If no codec type is selected, the first elementary stream suitable for
          subtitles is selected. Note that --teletext and --no-teletext override
          this option.
          The --no-codec and --codec parameters must not be the same; if they are,
          the --no-codec parameter is ignored. Each flag should be passed only
          once: passing it more than once is not supported yet, and only the
          last value is taken into consideration.
        
          Notes on adding credits:
          CCExtractor can _try_ to add a custom message (for credits for example) at
          the start and end of the file, looking for a window where there are no
          captions. If there is no such window, then no text will be added.
          The start window must be between the times given and must have enough time
          to display the message for at least the specified time.
        
          Notes on the CEA-708 decoder:
          By default, ccextractor now extracts both CEA-608 and CEA-708 subtitles
          if they are present in the input. This results in two output files: one
          for CEA-608 and one for CEA-708.
          To extract only CEA-608 subtitles, use -1, -2, or -12.
          To extract only CEA-708 subtitles, use -svc.
          To extract both CEA-608 and CEA-708 subtitles, use both -1/-2/-12 and -svc.
          While the CEA-708 decoder is starting to be useful, it's
          a work in progress. A number of things don't work yet in the decoder
          itself, and many of the auxiliary tools (case conversion to name one)
          won't do anything yet. Feel free to submit samples that cause problems
          and feature requests.
        
          Notes on spupng output format:
          One .xml file is created per output field. A set of .png files are created in
          a directory with the same base name as the corresponding .xml file(s), but with
          a .d extension. Each .png file will contain an image representing one caption
          and named subNNNN.png, starting with sub0000.png.
          For example, the command:
          ccextractor --out=spupng input.mpg
          will create the files:
          input.xml
          input.d/sub0000.png
          input.d/sub0001.png
          ...
          The command:
          ccextractor --out=spupng -o /tmp/output --output-field both input.mpg
          will create the files:
          /tmp/output_1.xml
          /tmp/output_1.d/sub0000.png
          /tmp/output_1.d/sub0001.png
          ...
          /tmp/output_2.xml
          /tmp/output_2.d/sub0000.png
          /tmp/output_2.d/sub0001.png
          ...
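
The spupng naming scheme described in this note can be reproduced with a few lines of Python, which may help when a script needs to locate the generated images. The helper name `spupng_paths` is hypothetical, and the number of captions is something the caller has to supply; the sketch only mirrors the naming rule (.d directory, subNNNN.png starting at sub0000.png).

```python
from pathlib import PurePosixPath


def spupng_paths(xml_path: str, captions: int) -> list:
    """List the .xml file and the subNNNN.png files expected next to it."""
    xml = PurePosixPath(xml_path)
    directory = xml.with_suffix(".d")   # same base name, .d extension
    images = [directory / f"sub{index:04d}.png" for index in range(captions)]
    return [str(xml)] + [str(image) for image in images]
```

Calling `spupng_paths("/tmp/output_1.xml", 2)` lists /tmp/output_1.xml followed by the two images under /tmp/output_1.d, as in the second example above.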