======= GSOC 2015 Documentation =======
Technical Documentation
Transport Stream
The Transport Stream data structures are made for multiprogram use.
Decoder
CCExtractor has a single decoder initialization function:

`struct lib_cc_decode* init_cc_decode (struct ccx_decoders_common_settings_t *setting)`

The decoder settings are passed through its parameter:

    struct ccx_decoders_common_settings_t {
        LLONG subs_delay; // ms to delay (or advance) subs
        enum ccx_output_format output_format; // What kind of output format should be used?
        int fix_padding; // Replace 0000 with 8080 in HDTV (needed for some cards)
        struct ccx_boundary_time extraction_start, extraction_end; // Segment we actually process
        int cc_to_stdout;
        int extract; // Extract 1st, 2nd or both fields
        int fullbin; // Disable pruning of padding cc blocks
        struct ccx_decoder_608_settings *settings_608; // Contains the settings for the 608 decoder.
        ccx_decoder_dtvcc_settings_t *settings_dtvcc; // Same for the CEA-708 captions decoder (dtvcc)
        int cc_channel; // Channel we want to dump in srt mode
        unsigned send_to_srv;
        unsigned int hauppauge_mode; // If 1, use PID=1003, process specially and so on
        int program_number;
        enum ccx_code_type codec;
        void *private_data;
    };

In these settings, `subs_delay` is deprecated and should not be used in new code. It was added for a transition period and needs to be removed. The same applies to `send_to_srv` and `output_format`.

If the demuxer has already initialized the decoder at demuxing time, as happens for teletext and dvbsub, then `private_data` can be filled with the context of the format-specific decoder, and the `codec` field (of type `enum ccx_code_type`) should be set to match.

### How to use multiprogram extraction using CCExtractor?

For command-line users, it is simply a matter of passing the `-multiprogram` argument to ccextractor:

`ccextractor -multiprogram`

### How to use OCR for extracting bitmap subtitle in text format?

To run OCR on bitmap subtitle images, compile the code with OCR enabled in the source. When compiling ccextractor, follow OCR.txt in the doc folder of the ccextractor source code.

### Linked list data structure in CCExtractor

Background
----------
The linked list implemented in CCExtractor is taken from the Linux source code. Note that the CCExtractor linked list does not have exactly the same syntax as the one in the Linux source, but it is similar. The only changes you will find in the CCExtractor version are adaptations for the Windows environment.

Implementation
--------------
To implement a linked list you need the head of the list, through which you can always traverse, update or delete the complete list. Keep the head in a safe location. The most common mistake developers make is keeping the head inside the node structure, which leaks memory once the head is lost.

For example, we need multiple decoders to extract different subtitles from different programs. As discussed above, the head of the list should not be kept in the decoder context. It must live in its parent, which cannot be destroyed before its children. We keep the head of the decoder list in the CCExtractor library context, which contains all demuxers, decoders and encoders. The decoder list is kept in the library context as follows:

~~~
struct lib_ccx_ctx {
    struct list_head dec_ctx_head;
};
~~~

The next step is to initialize the head of the list. For the decoder list, this is done in the initialization part of the library:

~~~
void init_libraries(void) {
    INIT_LIST_HEAD(&ctx->dec_ctx_head);
}
~~~

Now put the list connector in your decoder context. This is what actually makes your `dec_ctx` a node:

`struct lib_cc_decode { struct list_head list; };`

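The connector works because a Linux-style list only stores pointers to the embedded `struct list_head`, and the enclosing structure is recovered by subtracting the member's offset. The following is a minimal sketch of that idea, not CCExtractor's own header: `demo_decoder`, `demo_list_entry` and `demo_program_of` are hypothetical names written for this illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

/* Simplified stand-in for the list connector (assumption: same two-pointer
 * layout as the Linux-style list; this is not CCExtractor's header). */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, trimmed-down decoder context used only for this sketch. */
struct demo_decoder {
    int program_number;
    struct list_head list;   /* connector that makes this struct a node */
};

/* Step back from a pointer to the embedded member to the enclosing struct. */
#define demo_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given only the connector, return the program number of its owner. */
int demo_program_of(struct list_head *node)
{
    struct demo_decoder *dec;

    assert(node != NULL);
    dec = demo_list_entry(node, struct demo_decoder, list);
    return dec->program_number;
}
```

Because the connector carries no payload of its own, the same list code can link any structure that embeds a `struct list_head`.
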
Add your decoder node to the list. The code below assumes that you have allocated a decoder context and saved it in the variable `dec_ctx`, and that your library context, which holds the head of the list, is saved in the variable `ctx`.

`list_add_tail( &(dec_ctx->list), &(ctx->dec_ctx_head) );`

One reason people prefer arrays is that traversing, searching, updating and deleting them is very easy. That excuse no longer works: with these list helpers, the only remaining reason to use an array is contiguous memory allocation.

Traversing the linked list to search for or update a parameter:

~~~
list_for_each_entry(dec_ctx, &ctx->dec_ctx_head, list, struct lib_cc_decode) {
    // Access your parameter here
    printf("%d\n", dec_ctx->program_number);
}
~~~

A different macro is used to traverse the list when you might delete a node while traversing. Also remember to unlink the node from the list first, and only then free the memory allocated for it:

~~~
list_for_each_entry_safe(dec_ctx, dec_ctx1, &ctx->dec_ctx_head, list, struct lib_cc_decode) {
    list_del(&dec_ctx->list);   // unlink the node from the list first
    free(dec_ctx);              // then release its memory
}
~~~

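The steps above (head in the parent context, initialization, appending, traversal and safe deletion) can be put together in one self-contained sketch. This is an illustration under stated assumptions rather than CCExtractor's implementation: every `demo_` name is hypothetical, the helpers are simplified stand-ins for the real list header, and plain loops are used in place of the `list_for_each_entry` macros.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */
#include <stdlib.h>   /* malloc, free */

/* Simplified stand-ins for the list helpers (not CCExtractor's header). */
struct list_head { struct list_head *next, *prev; };

static void demo_list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void demo_list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
}

#define demo_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical contexts: the parent owns the head, the child embeds a node. */
struct demo_decoder { int program_number; struct list_head list; };
struct demo_lib_ctx { struct list_head dec_ctx_head; };

/* An empty list is a head that points to itself in both directions. */
void demo_lib_init(struct demo_lib_ctx *ctx)
{
    ctx->dec_ctx_head.next = &ctx->dec_ctx_head;
    ctx->dec_ctx_head.prev = &ctx->dec_ctx_head;
}

/* Append a decoder for program pn; returns 0 on success, -1 if out of memory. */
int demo_add_decoder(struct demo_lib_ctx *ctx, int pn)
{
    struct demo_decoder *dec = malloc(sizeof(*dec));
    if (dec == NULL)
        return -1;
    dec->program_number = pn;
    demo_list_add_tail(&dec->list, &ctx->dec_ctx_head);
    return 0;
}

/* Plain traversal: return the decoder for program pn, or NULL if absent. */
struct demo_decoder *demo_find_decoder(struct demo_lib_ctx *ctx, int pn)
{
    struct list_head *pos;
    for (pos = ctx->dec_ctx_head.next; pos != &ctx->dec_ctx_head; pos = pos->next) {
        struct demo_decoder *dec = demo_list_entry(pos, struct demo_decoder, list);
        if (dec->program_number == pn)
            return dec;
    }
    return NULL;
}

/* Safe traversal: remember the successor before unlinking and freeing a node.
 * Returns the number of decoders released. */
int demo_free_decoders(struct demo_lib_ctx *ctx)
{
    struct list_head *pos = ctx->dec_ctx_head.next;
    int freed = 0;
    while (pos != &ctx->dec_ctx_head) {
        struct list_head *next = pos->next;
        demo_list_del(pos);                                     /* unlink first */
        free(demo_list_entry(pos, struct demo_decoder, list));  /* then free */
        freed++;
        pos = next;
    }
    assert(ctx->dec_ctx_head.next == &ctx->dec_ctx_head);  /* list is empty again */
    return freed;
}
```

A caller would run `demo_lib_init` once, add decoders with `demo_add_decoder`, look one up with `demo_find_decoder`, and finish with `demo_free_decoders`. Since the head lives in the parent context, the whole list stays reachable for cleanup even after individual decoders are gone.
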
##### How to evaluate?

Clone my repository from GitHub into **any** directory on the GSoC server:

`git clone --depth `[`https://github.com/anshul1912/ccextractor.git`](https://github.com/anshul1912/ccextractor.git)

Run CCExtractor with the `-multiprogram` argument:

`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -multiprogram -o a.srt`

The file specified in the command above contains 6 programs with closed captions, but ccextractor tries to extract closed captions from all 8 programs, so it creates 2 empty files. You will find the following files in the directory where the command was executed:

`[anshul@gsocdev linux]$ ls a_*`
`a_1.srt a_2.srt a_3.srt a_4.srt a_5.srt a_6.srt a_7.srt a_8.srt`

From the same file, extract the subtitles individually using the `-pn` argument. You can use the following commands to extract each program:

`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -pn 1 -o pn_1.srt`
`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -pn 2 -o pn_2.srt`
`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -pn 3 -o pn_3.srt`
`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -pn 4 -o pn_4.srt`
`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -pn 5 -o pn_5.srt`
`ccextractor /repository/newRepository/TestFiles/General/Closedcaption_atsc_multiprog.ts -pn 6 -o pn_6.srt`

Now check the difference between the files generated by ccextractor with `-multiprogram` and with `-pn`. For example, you can use a command-line tool such as the following:

`diff a_1.srt pn_1.srt`

##### Contribution for blog

CCExtractor can now extract closed captions from all channels simultaneously. It is not just about extracting closed captions from all channels, but also about converting them to the desired format.

There is no longer any need for multiple capture devices to capture live closed captions. Use this wonderful open-source software and save your hard-earned money to donate to technology and make this software even more wonderful.

This multiprogram extraction works for DVBSub, Teletext and Closed Caption, which means that whichever country you are in, multiprogram extraction will work.

If multiprogram subtitle extraction is still not working in someone's country, we would say: invest here and help this newer generation do more wonderful and innovative work instead of reinventing the wheel.

##### Addendum

My graduation will be completed this year, so I would seek the opportunity to be a mentor or co-mentor for CCExtractor. CCExtractor is a great tool that can be used for various multimedia applications, so my development will continue in this project at least as long as C has its charm. Since the work I have done in CCExtractor was done with the utmost care, to the best of my knowledge, I will try to fix any bug in my part of the code, whether reported by someone else or encountered by me.