mailing-list for TeXmacs Users


Re: [TeXmacs] Large document size


  • From: "Sam Liddicott" <address@hidden>
  • To: address@hidden
  • Subject: Re: [TeXmacs] Large document size
  • Date: Thu, 12 May 2011 16:38:17 +0100


The problem seems to be that since I rebuilt TeXmacs under Ubuntu 11.04, qt_image_to_eps() is being used to generate the eps in image_to_eps() in the file image_files.cpp

I disabled that particular function and now the "convert" program from imagemagick is being used, so it is interesting to note that USE_GS is not defined.

Convert is better than QT and produces 158MB of .ps compared to gs's 200MB and QT's 280MB

This 158MB .ps shrinks down to 2.6MB once converted to PDF.

So the culprit is qt_image_to_eps() on Ubuntu 11.04 doing something too clever that results in enormous files which often render badly.

I'm also willing to swear in a very general sort of way that I don't think that -dPDFSETTINGS=/printer or -dPDFSETTINGS=/screen made any difference to final pdf from the .ps generated with qt_image_to_eps() - but when I take away qt_image_to_eps() the -dPDFSETTINGS makes a difference to the size of the pdf and quality of embedded images.
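
To check the -dPDFSETTINGS effect in a repeatable way, something like the following sketch could be used. It converts one .ps file with each of the standard Ghostscript presets and prints the resulting PDF size in bytes. The function name compare_presets and the PS2PDF override variable are made up for illustration; only ps2pdf and the -dPDFSETTINGS presets themselves are real.

```shell
#!/bin/sh
# Convert one .ps with several -dPDFSETTINGS presets and print the size of
# each result. compare_presets and the PS2PDF override are hypothetical names.
compare_presets() {
    ps=$1
    converter=${PS2PDF:-ps2pdf}      # assumption: ps2pdf is on the PATH
    for preset in screen ebook printer prepress; do
        out=${ps%.ps}-$preset.pdf
        "$converter" "-dPDFSETTINGS=/$preset" "$ps" "$out" || return 1
        # $(( ... )) strips any padding some wc implementations add.
        echo "$preset $(( $(wc -c < "$out") ))"
    done
}

# Usage: compare_presets document.ps
```

If every preset gives the same size, the images are probably in a form the converter does not resample, which would match what I saw with the qt_image_to_eps() output.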

I propose quite strongly that qt_image_to_eps() not be used.

Sam


On 12/05/11 16:00, Sam Liddicott wrote:


With the help of this script running in my .TeXmacs/system/tmp folder:
while : ; do ln *eps x/ ; ls -l x ; sleep 0.1 ; done

I've got a load of the .eps files.
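
For anyone wanting to repeat this, a slightly more careful version of that loop might look like the sketch below. The directory name x and the 0.1 second interval come from the one-liner above; the function name snapshot_eps and the skip-if-already-linked check are additions for illustration, not anything TeXmacs provides. Hard links only work when both directories are on the same filesystem, which holds for a subdirectory of the tmp folder.

```shell
#!/bin/sh
# Hard-link every .eps in SRC into DST so that a copy survives after the
# temporary original is deleted. snapshot_eps is a hypothetical helper name.
snapshot_eps() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for f in "$src"/*.eps; do
        [ -e "$f" ] || continue              # glob matched nothing
        target=$dst/$(basename "$f")
        [ -e "$target" ] || ln "$f" "$target"
    done
}

# Usage (assumed to run from ~/.TeXmacs/system/tmp while exporting):
#   while : ; do snapshot_eps . x ; sleep 0.1 ; done
```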

I note that the eps files fed to gs on Ubuntu 11.04 are usually larger than those fed to gs on 10.04, and have a different form.

Those on 11.04 have the bitmap defined with loads of lines like this:
138 140 r5
7420 5980 10 10 rf
142 148 r5
7430 5980 10 10 rf
142 140 r5
7440 5980 10 10 rf
142 148 r5

Those on 10.04 have the bitmap defined like this:
colorimage
[B?XWa2ZEF`...
....

which appears to be perhaps a base64-style encoding of the bitmap.

That would account for the difference in size of eps, and why the .ps is 50% bigger, but does not directly explain how this should cause a 20x bigger PDF.
I don't know... maybe pstopdf recognized the "colorimage" form and optimised and jpeg'd it?
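
To sort a directory of captured .eps files into the two forms, a rough sketch along these lines could help. It relies only on the excerpts quoted above (lines of four numbers followed by "rf", or the word "colorimage"), so treat the patterns as assumptions rather than a proper EPS parser. The function name classify_eps is hypothetical.

```shell
#!/bin/sh
# Guess how an EPS file encodes its bitmap. classify_eps is a hypothetical
# helper; the patterns are taken only from the excerpts quoted above.
classify_eps() {
    file=$1
    # Count lines such as "7420 5980 10 10 rf": four numbers, then "rf".
    rects=$(grep -c -E '^[0-9]+ [0-9]+ [0-9]+ [0-9]+ rf$' "$file" || true)
    if [ "$rects" -gt 0 ]; then
        echo "rectangles $rects"
    elif grep -q 'colorimage' "$file"; then
        echo "colorimage"
    else
        echo "unknown"
    fi
}

# Usage:
#   for f in x/*.eps; do printf '%s: %s\n' "$f" "$(classify_eps "$f")"; done
```

A file reported as "rectangles" with a count close to its pixel count would suggest that each pixel is being drawn as its own filled rectangle.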

Now I need to work out how/why texmacs gets the images emitted in different forms... anyone who knows should speak up now.

Sam

On 12/05/11 15:04, Sam Liddicott wrote:


I copied the 279MB .ps file to an ubuntu 10.04 machine and ran pstopdf and it produced a 42MB pdf as before so I don't think pstopdf is to blame.

I then copied over my .tm files (and my .TeXmacs dir) and exported the .ps and it came only to 99MB instead of 279MB - that's a big difference.
So TeXmacs on my Ubuntu 11.04 produces enormous .ps files

I copied the 99MB .ps back to my 11.04 machine and ran ps2pdf and it produces a 2.6MB .pdf file instead of a 42MB file.

So I need to find out what texmacs does to export the .ps file to see why it is behaving differently on ubuntu 11.04
Does anyone know? Please tell!

I'm using my own git based build of texmacs.
On 10.04 I'm using TeXmacs with top change vdhoeven Fri March 18th "Fix"
On 11.04 I'm using top change vdhoeven April 27

I'll now rule that out as a difference, I'm building the version I was using on 11.04. (My 11.04 was an upgrade from 10.10, not a re-install so I don't think it's a case of different code paths because of missing libs).

I've reverted my TeXmacs and it makes no difference.

I'm looking at the way TeXmacs invokes gs when exporting the .ps file, it calls like this:

gs -dQUIET -dNOPAUSE -dBATCH -dSAFER -sDEVICE=epswrite -dEPSCrop

I guess for each of the image files I use. This must be producing different files on Ubuntu 11.04

Sam



On 12/05/11 13:35, Sam Liddicott wrote:


Hmmmm... I tried viewing my PDF on windows in foxit pdf view and it's great!

Evince under ubuntu is rubbish and shows all the half-tone style dithering and the rest.

However I can't put all the blame there, I have a PDF of a document this one was derived from, using many of the same images and it still displays fine, and from the artifacts when zooming closely, must be using JPEG (and it's a much smaller file).

So I'm guessing that the PS/PDF internal image format I'm now getting from TeXmacs is different such that
1. images are larger
2. when converted to pdf, display badly in evince pdf viewer (but as PS display fine).

I don't have any record of how big the .ps files used to be.

I'll try and find an ubuntu 10.10 machine I can run pstopdf on and see if that produces PDF of a normal size.

Sam






On 12/05/11 11:53, Sam Liddicott wrote:


On 11/05/11 19:39, Sam Liddicott wrote:
I have a 48 page TeXmacs document, many pages of which have images which are screen-shots and so not very high resolution - I think all are less than 800 pixels wide.

The png image source directory and .tm file come to under 9MB, and the document uses only about half of those images.

The exported post-script is 280MB and the PDF is 40MB - which are horrific sizes.

(I'm using latest git repository from git://gitorious.org/texmacs/texmacs.git)

I tried compiling with --enable-pdf-renderer but then pdf export fails with:


** WARNING ** Failed to load AGL file "pdfglyphlist.txt"...
** WARNING ** Failed to load AGL file "glyphlist.txt"...
/home/sam/.TeXmacs/system/tmp/tmp_2009563087.pdf
/home/sam/.TeXmacs/system/tmp/tmp_514256944.pdf
** ERROR ** TFM: Invalid TFM ID: -1

Does anyone have any tips on reducing the PDF size? I tried changing from 600dpi to 300dpi but it only saved 2MB on the PDF

I'm quite certain from the PDF view (where the screen shots are a bit washed out and have a funny interference pattern) that the images are being scaled up and even screened in some way, which is why the post script is so big.


Changing the dpi for the printer settings affected the relative size of the images in the document :-( so I had to change back.

Further examination lays the blame with pstopdf, although I guess a png decoder could have been embedded in the post-script which would have kept the .ps file small - and maybe the PDF small too, although I still have to investigate what the PDF is doing.

This link talks about png decoders in post-script.
http://www.tek-tips.com/viewthread.cfm?qid=1050035&page=7

Here is a slice of one of my images:
http://mail.liddicott.com/blotchy-2-orig.png

It looks just as fine in the post-script view.

And here is a screenshot from my PDF viewer at 400%
http://mail.liddicott.com/blotchy-2.png

The areas of solid 24 bit colour have been dot-ified, some kind of hatching or other dithering it seems.

The change I observe may be different defaults for pstopdf as I have just upgraded my ubuntu release.

Sam








--
[FSF Associate Member #2325] <http://www.fsf.org/register_form?referrer=2325>

<http://www.openrightsgroup.org/>


