The data used in Behind the sound bite was extracted from the Hong Kong government's Policy Address website and made available to developers by the South China Morning Post. It is provided as-is, and we make no guarantees about its accuracy.
For the raw text of the addresses, each paragraph was converted to a single row of text. Although only the English version was used in our graphic, we also extracted the Chinese text using the same method.
Additionally, pages.csv records which page of the government site each paragraph of the address appears on. Long paragraphs may also, albeit rarely, be spread over several pages. chapters.csv does the same thing for chapters.
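As a rough illustration of how such a page mapping could be used, the sketch below indexes paragraphs by page and flags paragraphs that span more than one page. The column names `page` and `paragraph` and the sample rows are assumptions made for this example only; the actual header of pages.csv may differ and should be checked before adapting the code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: one row per (page, paragraph) pair.
# The real pages.csv may use different column names and values.
SAMPLE_PAGES_CSV = """page,paragraph
1,1
1,2
2,2
2,3
"""


def index_pages(csv_file):
    """Return (paragraphs per page, pages per paragraph) from a pages-style CSV."""
    by_page = defaultdict(list)
    by_paragraph = defaultdict(list)
    for row in csv.DictReader(csv_file):
        page, paragraph = int(row["page"]), int(row["paragraph"])
        by_page[page].append(paragraph)
        by_paragraph[paragraph].append(page)
    return dict(by_page), dict(by_paragraph)


by_page, by_paragraph = index_pages(io.StringIO(SAMPLE_PAGES_CSV))

# Paragraphs that are spread over more than one page.
split_paragraphs = [p for p, pages in by_paragraph.items() if len(pages) > 1]
```

To run this against the real file, the `io.StringIO` sample could be replaced with `open("pages.csv", newline="", encoding="utf-8")`, with the column names adjusted to match the file. The same approach should carry over to chapters.csv if it follows a similar layout.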
-- Updated on 2014-01-16 at 10 p.m. HKT