mailing-list for TeXmacs Users


[PATCH] Re: [TeXmacs] Large document size


  • From: "Sam Liddicott" <address@hidden>
  • To: address@hidden, address@hidden
  • Subject: [PATCH] Re: [TeXmacs] Large document size
  • Date: Fri, 13 May 2011 10:00:09 +0100


If I disable the use of gs to "condition" the eps files written by qt_image_to_eps then the resulting .ps file seems to not be conformant:
1. gv can only display page 1
2. evince can only display some page in the middle.
3. plain old gs can display all pages (although of course it requires me to press enter after each page).

I guess this is why gs was being used to condition these .eps images.

I note that when ps2ps is invoked on the non-conformant output ps file, it reduces it from 158MB to 9MB and makes it work just fine in evince and so on.

Also, ps2pdf when invoked on the non-conformant output ps file produces a 2.5MB pdf file that also seems to work just fine.

Primary conclusions
1. The .eps written by qt_image_to_eps is not satisfactory.
2. The .eps written by gs is not satisfactory.
3. The .eps written by the imagemagick "convert" utility is satisfactory.

Secondary conclusions
1. ps2ps might be invoked to some advantage on the output .ps in all cases.
2. qt_image_to_eps might do well to copy the format produced by imagemagick or that produced by gs version 8 when processing the imagemagick format.
3. gs 9 tools should not be used to output intermediate post-script files

Actions - patch provided, tested on Ubuntu 11.04 - requires imagemagick to be present
1. In image_to_eps
a. Disable qt_image_to_eps
b. Disable gs_to_eps
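
As a rough illustration of the effect of these actions (this is not the TeXmacs code itself): image_to_eps tries each converter in turn and returns at the first one that supports the image, so disabling the first two lets the call fall through to convert. The Converter struct, pick_converter, patched_chain and the catch-all "supports" predicate below are hypothetical names invented for this sketch.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one entry in the image_to_eps fallback chain.
struct Converter {
  std::string name;                                  // label only
  bool enabled;                                      // false = compiled out by the patch
  std::function<bool(const std::string&)> supports;  // can it handle this image?
};

// Return the name of the first enabled converter that supports the image,
// or an empty string when none applies.
std::string pick_converter(const std::vector<Converter>& chain,
                           const std::string& image) {
  for (const Converter& c : chain) {
    if (c.enabled && c.supports(image)) return c.name;
  }
  return "";
}

// The chain as the patch leaves it: both early exits are disabled.
// The catch-all predicate is an assumption made to keep the sketch short.
std::vector<Converter> patched_chain() {
  auto any = [](const std::string&) { return true; };
  return {
      {"qt_image_to_eps", false, any},
      {"gs_to_eps", false, any},
      {"convert", true, any},
  };
}
```

With this model, pick_converter(patched_chain(), "shot.png") selects "convert", which is the fall-back the patch is meant to force.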

Future actions:
1. Fix qt_image_to_eps to produce conformant eps files directly instead of using gs to sanitize
2. Maybe use the gs eps2write target instead of the epswrite target when it exists (the ps2write target doesn't work).
3. Consider invoking ps2ps on the generated ps to condition and trim the resultant .ps file. This can reduce a .ps file from 167MB to 10MB
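
A minimal sketch of what the ps2ps step in item 3 might look like if it were run from C++ after the export. It assumes ps2ps is on the PATH and is called as "ps2ps input output"; shell_quote, ps2ps_command and condition_ps are hypothetical helper names, not existing TeXmacs functions.

```cpp
#include <cstdlib>
#include <string>

// Quote one argument for a POSIX shell by wrapping it in single quotes.
std::string shell_quote(const std::string& arg) {
  std::string out = "'";
  for (char c : arg) {
    if (c == '\'') out += "'\\''";  // close quote, escaped quote, reopen quote
    else out += c;
  }
  out += "'";
  return out;
}

// Build the command line "ps2ps <input> <output>".
std::string ps2ps_command(const std::string& input, const std::string& output) {
  return "ps2ps " + shell_quote(input) + " " + shell_quote(output);
}

// Hypothetical post-processing step. On POSIX systems std::system returns 0
// when the command exits successfully; anything else is treated as failure.
bool condition_ps(const std::string& input, const std::string& output) {
  return std::system(ps2ps_command(input, output).c_str()) == 0;
}
```

The caller would write the export to a temporary file first and only replace the final .ps when condition_ps reports success, so a missing ps2ps leaves the original export untouched.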

Sam

On 12/05/11 18:05, Sam Liddicott wrote:


On 12/05/11 17:02, marc lalaude-labayle wrote:
Hi,

on a fresh ubuntu 11.04 install, with the official texmacs package, I too get huge .ps files.

I've asked for advice on the gs mailing list but my own searches bring up this gs bug report that seems to describe the problem:
http://bugs.ghostscript.com/show_bug.cgi?id=691914

Which is pretty much this:

eps2eps, gs, etc. should not be used to "optimize" bitmaps for inclusion: by the time gs has finished, the image may no longer be a bitmap but rather a load of fill/stroke rectangle commands, and the colours may already have been altered with respect to ICC colour profiles and all that...

ps2write is recommended but does not yet produce EPS-conformant output.

The answer is: texmacs should no longer use gs_to_eps on eps files.
gs_to_eps should be optimised to do a straight copy if the input file is also an eps file, and perhaps more correctly, it should not be called from qt_image_to_eps that knows full well the input file is eps.
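
The "straight copy" idea could look roughly like the sketch below. It uses plain std::string paths rather than the TeXmacs url type, and has_eps_suffix and copy_if_already_eps are hypothetical names; the real gs_to_eps is not shown in this thread, so this is only a guess at the shape of such a fast path.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <string>

// True when the path ends in ".eps", compared case-insensitively.
bool has_eps_suffix(const std::string& path) {
  std::string::size_type dot = path.rfind('.');
  if (dot == std::string::npos) return false;
  std::string ext = path.substr(dot + 1);
  std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return ext == "eps";
}

// Copy the input byte for byte when it is already an EPS file.
// Returns false when the input is not EPS or the copy fails, so the caller
// can fall back to a real converter. Note that an empty input file also
// reports failure, because streaming an empty buffer sets the fail bit.
bool copy_if_already_eps(const std::string& image, const std::string& eps) {
  if (!has_eps_suffix(image)) return false;
  std::ifstream in(image, std::ios::binary);
  if (!in) return false;
  std::ofstream out(eps, std::ios::binary | std::ios::trunc);
  out << in.rdbuf();
  return static_cast<bool>(out);
}
```

Checking only the suffix is the simplest test; a stricter version could also look for the "%!PS-Adobe" header line before trusting the file.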

I hope to submit a patch for texmacs tomorrow,

Sam


Marc

2011/5/12 Sam Liddicott <address@hidden>


All lies! qt_image_to_eps is not a qt function but a texmacs
function - and further, it is responsible for the "colorimage"
format I was so praising of on my 10.04 system.

However disabling use of qt_image_to_eps did stop the problems, so
I only have a short code path to investigate to find the true cause.

Sam


On 12/05/11 16:38, Sam Liddicott wrote:



The problem seems to be that since I rebuilt TeXmacs under
Ubuntu 11.04, qt_image_to_eps is being used to generate the
eps in image_to_eps() in image_files.cpp

I disabled that particular function and now the "convert"
program from imagemagick is being used, so it is interesting
to note that USE_GS is not defined.

Convert is better than QT and produces 158MB of .ps compared
to gs's 200MB and QT's 280MB

This 158MB .ps shrinks down to 2.6MB

So the culprit is qt_image_to_eps() on Ubuntu 11.04 doing
something too clever that results in enormous files which
often render badly.

I'm also willing to swear in a very general sort of way that I
don't think that -dPDFSETTINGS=/printer or
-dPDFSETTINGS=/screen made any difference to final pdf from
the .ps generated with qt_image_to_eps() - but when I take
away qt_image_to_eps() the -dPDFSETTINGS makes a difference to
the size of the pdf and quality of embedded images.

I propose quite strongly that qt_image_to_eps() not be used.

Sam


On 12/05/11 16:00, Sam Liddicott wrote:



With the help of this script running in my
.TeXmacs/system/tmp folder:
while : ; do ln *eps x/ ; ls -l x ; sleep 0.1 ; done

I've got a load of the .eps files.

I note that the eps files fed to gs on Ubuntu 11.04 are
usually larger than those fed to gs on 10.04, and have a
different form.

Those on 11.04 have the bitmap defined with loads of
lines like this:
138 140 r5
7420 5980 10 10 rf
142 148 r5
7430 5980 10 10 rf
142 140 r5
7440 5980 10 10 rf
142 148 r5

Those on 10.04 have the bitmap defined like this:
colorimage
[B?XWa2ZEF`...
.....

which appears to be perhaps a base64 style encoding of
the bitmap.

That would account for the difference in size of eps, and
why the .ps is 50% bigger, but does not directly explain
how this should cause a 20x bigger PDF.
I don't know... maybe pstopdf recognized the "colorimage"
form and optimised and jpeg'd it?
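
To tell the two forms apart without opening each file by hand, something like the following sketch could be run over the captured .eps files. It simply counts lines ending in " rf" (assumed here to be the rectangle-fill shorthand seen in the 11.04 output) and checks whether "colorimage" appears anywhere; EpsStats and scan_eps are hypothetical names.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical summary of how an EPS file encodes its bitmap.
struct EpsStats {
  long rect_fill_lines = 0;     // lines ending in " rf"
  bool has_colorimage = false;  // the "colorimage" operator appears somewhere
};

// Scan an EPS stream line by line and collect the two indicators.
EpsStats scan_eps(std::istream& in) {
  EpsStats stats;
  std::string line;
  while (std::getline(in, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
    if (line.size() >= 3 && line.compare(line.size() - 3, 3, " rf") == 0) {
      ++stats.rect_fill_lines;
    }
    if (line.find("colorimage") != std::string::npos) {
      stats.has_colorimage = true;
    }
  }
  return stats;
}
```

A file with many rect_fill_lines and no colorimage would be of the 11.04 kind; pass a std::ifstream opened on the .eps file, or a std::istringstream when experimenting.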

Now I need to work out how/why texmacs gets the images
emitted in different forms... anyone who knows should
speak up now.

Sam

On 12/05/11 15:04, Sam Liddicott wrote:



I copied the 279MB .ps file to an ubuntu 10.04 machine
and ran pstopdf and it produced a 42MB pdf as before
so I don't think pstopdf is to blame.

I then copied over my .tm files (and my .TeXmacs dir)
and exported the .ps and it came only to 99MB instead
of 279MB - that's a big difference.
So TeXmacs on my Ubuntu 11.04 produces enormous .ps files.

I copied the 99MB .ps back to my 11.04 machine and ran
ps2pdf and it produces a 2.6MB .pdf file instead of a
42MB file.

So I need to find out what texmacs does to export the
.ps file to see why it is behaving differently on
Ubuntu 11.04.
Does anyone know? Please tell!

I'm using my own git based build of texmacs.
On 10.04 I'm using TeXmacs with top change vdhoeven
Fri March 18th "Fix"
On 11.04 I'm using top change vdhoeven April 27

I'll now rule that out as a difference, I'm building
the version I was using on 11.04. (My 11.04 was an
upgrade from 10.10, not a re-install so I don't think
it's a case of different code paths because of missing
libs).

I've reverted my TeXmacs and it makes no difference.

I'm looking at the way TeXmacs invokes gs when
exporting the .ps file, it calls like this:

gs -dQUIET -dNOPAUSE -dBATCH -dSAFER -sDEVICE=epswrite
-dEPSCrop

I guess for each of the image files I use. This must
be producing different files on Ubuntu 11.04

Sam



On 12/05/11 13:35, Sam Liddicott wrote:



Hmmmm... I tried viewing my PDF on windows in
the Foxit PDF viewer and it's great!

Evince under ubuntu is rubbish and shows all the
half-tone style dithering and the rest.

However I can't put all the blame there, I have a
PDF of a document this one was derived from, using
many of the same images and it still displays
fine, and from the artifacts when zooming closely,
must be using JPEG (and it's a much smaller file).

So I'm guessing that the PS/PDF internal image
format I'm now getting from TeXmacs is different
such that
1. images are larger
2. when converted to pdf, display badly in evince
pdf viewer (but as PS display fine).

I don't have any record of how big the .ps files
used to be.

I'll try and find an ubuntu 10.10 machine I can
run pstopdf on and see if that produces PDF of a
normal size.

Sam






On 12/05/11 11:53, Sam Liddicott wrote:



On 11/05/11 19:39, Sam Liddicott wrote:

I have a 48 page TeXmacs document, many pages of
which have images which are screen-shots and so
not very high resolution - I think all are less
than 800 pixels wide.

The png image source directory and .tm
file come to under 9MB, and the document
uses only about half of those images.

The exported post-script is 280MB and the
PDF is 40MB - which are horrific sizes.

(I'm using the latest git repository from
git://gitorious.org/texmacs/texmacs.git)

I tried compiling with
--enable-pdf-renderer but then pdf export
fails with:


** WARNING ** Failed to load AGL file
"pdfglyphlist.txt"...
** WARNING ** Failed to load AGL file
"glyphlist.txt"...
/home/sam/.TeXmacs/system/tmp/tmp_2009563087.pdf
/home/sam/.TeXmacs/system/tmp/tmp_514256944.pdf
** ERROR ** TFM: Invalid TFM ID: -1

Does anyone have any tips on reducing the
PDF size? I tried changing from 600dpi to
300dpi but it only saved 2MB on the PDF

I'm quite certain from the PDF view (where
the screen shots are a bit washed out and
have a funny interference pattern) that
the images are being scaled up and even
screened in some way, which is why the
post script is so big.


Changing the dpi for the printer settings
affected the relative size of the images in
the document :-( so I had to change back.

Further examination lays the blame with
pstopdf, although I guess a png decoder could
have been embedded in the post-script which
would have kept the .ps file small - and
maybe the PDF small too, although I still have
to investigate what the PDF is doing.

This link talks about png decoders in post-script.
http://www.tek-tips.com/viewthread.cfm?qid=1050035&page=7

Here is a slice of one of my images:
http://mail.liddicott.com/blotchy-2-orig.png

It looks just as fine in the post-script view.

And here is a screenshot from my PDF viewer at
400%
http://mail.liddicott.com/blotchy-2.png

The areas of solid 24 bit colour have been
dot-ified, some kind of hatching or other
dithering it seems.

The change I observe may be different defaults
for pstopdf as I have just upgraded my ubuntu
release.

Sam











-- [FSF Associate Member #2325]
<http://www.fsf.org/register_form?referrer=2325>

<http://www.openrightsgroup.org/>






From 16078b916bbabc83562ef9c63a679bf4298957a8 Mon Sep 17 00:00:00 2001
From: Sam Liddicott <address@hidden>
Date: Fri, 13 May 2011 09:53:09 +0100
Subject: [PATCH] Disable qt_image_to_eps and gs_to_eps in image_to_eps

This patch is required when Ghostscript 9 is installed.

qt_image_to_eps is broken and results in non-conformant .ps exports
unless gs_to_eps is used.

gs_to_eps is broken in gs 9 by design.
   see: http://bugs.ghostscript.com/show_bug.cgi?id=691914

The solution would be to use the eps2write device instead of the epswrite
device, however that device isn't implemented yet and qt_image_to_eps
should produce decent .eps files like imagemagick's "convert" does.

This patch forces fall-back to imagemagick's convert.

An alternative would be to move the convert script to the top of
image_to_eps so that those without imagemagick would at least
get something but as that results in 20x size .pdf files
I think that is a bad idea.

Signed-off-by: Sam Liddicott <address@hidden>
---
 src/src/System/Files/image_files.cpp  |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
 mode change 100644 => 100755 src/plugins/asymptote/bin/perl-tm_asy

diff --git a/src/plugins/asymptote/bin/perl-tm_asy b/src/plugins/asymptote/bin/perl-tm_asy
old mode 100644
new mode 100755
diff --git a/src/src/System/Files/image_files.cpp b/src/src/System/Files/image_files.cpp
index 5ff257e..0640469 100644
--- a/src/src/System/Files/image_files.cpp
+++ b/src/src/System/Files/image_files.cpp
@@ -269,13 +269,13 @@ image_to_eps (url image, url eps, int w_pt, int h_pt, int dpi) {
 /*  if ((suffix (eps) != "eps") && (suffix (eps) != "ps")) {
     cerr << "TeXmacs] warning: " << concretize (eps) << " has no .eps or .ps suffix\n";
   }*/
-#ifdef QTTEXMACS
+#ifdef QTTEXMACS_when_it_produces_conformat_eps_files
   if (qt_supports (image)) {
     qt_image_to_eps (image, eps, w_pt, h_pt, dpi);
     return;
   }
 #endif
-#ifdef USE_GS
+#ifdef USE_GS_when_it_preserves_the_bitmap_nature
   if (gs_supports (image)) {
     gs_to_eps (image, eps);
     return;
-- 
1.7.4.1




Archive powered by MHonArc 2.6.19.

Top of page