Web Conversion Nightmare!!!

Status
Not open for further replies.

eugenetyson

Technical User
Aug 21, 2007
1,066
IE
Can anyone offer a solution to this? Please fire any questions you want at me and I'll try to answer them to the best of my ability. A little on my background: I'm a fully qualified graphic designer with solid know-how in printing and all the Adobe programs, plus a Quark background.

I work in the publications department of a largish company. We are currently making our files available on the Web. The problem is that we use InDesign as our design and typesetting package, which is what we should be using, since we concentrate mainly on books. But it's a program designed for printing on paper, not for publishing on the web.

The company we hired to build the search engine for our books is useless, in my opinion. Basically we are exporting all the files to RTF, then pulling the images out of the InDesign files and placing them into the RTF for them. Some of those images are vector art drawn directly inside InDesign, which made print preparation easier because no external programs were needed.

My thing is, though, we can just export the PDF to RTF and all the images are there. But PDFs are problematic at best, as a lot of detail, like Character and Paragraph Styles, is lost when the files are converted to PDF. So when you export from the PDF, the styles come out with names like CM+147, which don't work with their CSS.

The company we have hired to do the conversion won't use the RTFs made from the PDFs, understandably, because they can't make the CSS work. But at the end of the day we are the ones left converting all the files to RTF and placing the images in, and we are the ones who have to convert all the styles for the web. We don't want to. We want to give someone our InDesign files or PDFs and have them made ready for the web, with an all-inclusive search engine designed specifically for our website and for our members to access.

That isn't my job, though. We hired this company to put our data on our website with a built-in search function. I tried to find some of our publications online today and it was horrible. I searched for the keywords and nothing came up for the document I wanted. I went through the archive structure in the left pane and found it that way. Other documents matched the keywords, I got hits on them and could see they were marked with a tick, but the one thing I was looking for wasn't marked.

We used to have PDFs up there, but the people who buy our books found it very slow to open a PDF in their browser, probably due to their internet connection or computer. So we wanted a totally "TEXT" file they could search, with images. You probably ask, if they buy your book, why do they need

Ok I've rambled on enough. Here's my question:

Does anyone know of a company that will take our InDesign files, convert them for the web, and construct a good search engine for our website?

It doesn't matter if these are separate companies or one company.

How would you proceed with this problem?
 
The problem here is that the native format for web documents isn't RTF, PDF or InDesign. The web serves HTML documents. Converting your InDesign documents to HTML is not going to be an easy thing to automate; in fact, I'd say it would be more or less impossible. Certainly impractical.

It sounds to me like they are using the wrong approach to what you want to do and trying to shoehorn what you supply into a system that just doesn't work that way.

Here's one way I might consider doing it.
Create PDFs of the documents. You shouldn't have any problems with this from InDesign. This will create downloadable files.

Now, you need something that will go through the PDF and index its content for search purposes. This isn't too difficult and is certainly possible to do. I won't go into specifics here for now - this isn't the right forum.

The crux is that you have 2 sets of 'data'. Your PDF documents and a searchable index that links to the relevant PDF.
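To make the "two sets of data" idea a bit more concrete, here is a minimal sketch in Python, not a finished tool. It assumes the text of each PDF page has already been extracted by some other means; the file names, sample text and function names are all made up for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of (pdf_name, page_number) pairs containing it.

    `pages` is an iterable of (pdf_name, page_number, page_text) tuples.
    How the text is pulled out of each PDF is left open here.
    """
    index = defaultdict(set)
    for pdf_name, page_number, text in pages:
        for word in tokenize(text):
            index[word].add((pdf_name, page_number))
    return index

def search(index, query):
    """Return the sorted locations that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

# Hypothetical sample data standing in for text extracted from two PDFs.
sample_pages = [
    ("book-a.pdf", 1, "Guidance on fire safety in offices"),
    ("book-a.pdf", 2, "Electrical safety and testing"),
    ("book-b.pdf", 7, "Fire doors and escape routes"),
]
index = build_index(sample_pages)
print(search(index, "fire safety"))
```

The point is that the search never opens the PDFs themselves. It only consults the index, and each hit carries enough information (file and page) to link back to the right document.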

How many documents are you dealing with? How large are they?

Do you know that Google, as an example, indexes PDFs and that you can create your own site-only search using Google?


Try asking this in Forum253

--
Tek-Tips Forums is Member Supported. Click Here to donate

<honk>*:O)</honk>

Tyres: Mine's a pint of the black stuff.
Mike: You can't drink a pint of Bovril.


 
Hey thanks for your reply.

I know RTF, PDF and InDesign aren't the right tools/files for doing the job, that's the nightmare.

How many documents are you dealing with? How large are they?

Do you know that Google, as an example, indexes PDFs and that you can create your own site only search using Google?

I'm dealing with about 50 books. Ranging from 100 pages up to 3,200 pages.

We want an electronic searchable database for all our books.

The reason for exporting to RTF is because our Styles in InDesign match their CSS for the Web. But RTFs don't export images, it's a text format. I know about this bit.

What I want done is the whole thing: someone takes my InDesign files and creates a searchable database of all of them, accessible on the Web, so that our members can cross-reference things from one book to another, as the books are all sort of interconnected.

I'll pass on your suggestion and take it under consideration, but I just need a Convert to Web button in InDesign. I've done all the hard work of getting them typeset, laid out, designed and printed, now I'm being asked to do something I know nothing about and the company we hired are making a meal of it. I know it can be done better, I want people to suggest ways, because I don't know.
 
There's a difference between creating a searchable database of books and actually converting the book layouts to HTML.

If you just want to be able to search and return in which book and where the search term occurs then all you need is text, no images or anything.

If, on the other hand, you need the users to be able to view the books online that's a different matter.
You could keep the pages/books as PDFs and these would load into the user's browser (via a PDF viewer) or download to their desktop, depending on their OS and browser setup.

Do you need the book pages presented online to look the same as those in print? Remember, print and web are very different mediums.

The advantage of this method is that your books will look identical to the printed versions. You can also use PDF's interactive features to create links to other pages/books.
The disadvantage is that you will basically be giving your books away.

Or how about this for another idea:

Create raw text for compiling the search engine database either via outputting text from InDesign or PDF.
Create JPEG images from a PDF page on the fly when someone wants to view the page. ImageMagick and Ghostscript can do this easily.
You could actually create all the pages as JPEGs in advance, I suppose. JPEGs will display as you intended in the browser without loading a PDF viewer or forcing a file download.

The key is that you don't search the book/pages directly. You search a separate database of indexed text that will link to the appropriate page image.
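As a rough sketch of how a search hit could lead to a page image, assuming Ghostscript is installed and one PDF per book: the folder layout, file naming scheme and function names below are invented for illustration, and the code only builds the command rather than running it.

```python
from pathlib import Path

def page_image_path(image_dir, book_id, page_number):
    """Return where the JPEG for one page lives (naming scheme is an assumption)."""
    return Path(image_dir) / book_id / f"page-{page_number:04d}.jpg"

def ghostscript_command(pdf_path, page_number, output_path, dpi=100):
    """Build a Ghostscript command line that renders a single PDF page to JPEG."""
    return [
        "gs",                             # "gswin64c" on Windows installs
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=jpeg",                  # JPEG output device
        f"-r{dpi}",                       # resolution in dots per inch
        f"-dFirstPage={page_number}",     # render only this page
        f"-dLastPage={page_number}",
        f"-sOutputFile={output_path}",
        str(pdf_path),
    ]

def image_for_hit(hit, pdf_dir="pdfs", image_dir="pages"):
    """Turn a search hit (book_id, page_number) into an image path plus the
    command that would create the image if it has not been rendered yet."""
    book_id, page_number = hit
    output = page_image_path(image_dir, book_id, page_number)
    command = ghostscript_command(Path(pdf_dir) / f"{book_id}.pdf", page_number, output)
    return output, command

output, command = image_for_hit(("book-a", 12))
print(output)
print(" ".join(command))
# To actually render, create the output folder first and then run
# subprocess.run(command, check=True) on a machine with Ghostscript installed.
```

Rendering everything in advance is just a loop over the same command for every page. Rendering on the fly means checking whether the image file exists first and only running the command when it doesn't.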

Thinking about it, Google Booksearch uses scanned images of books.

The disadvantage here, though, is that you cannot hyperlink text within the images (unless you use image maps, but that will be hard to do automatically).


Either way, it's not a content management system issue. Try the web design forum or InDesign forum (if you haven't already done so). Though in my opinion any 'convert to web button' will make a hash of things (I speak from past experience here).

 
I know that the conversion to HTML needs to be done. I'm fully aware of that.

I've already said I don't want PDFs, for various reasons. I've said that I want someone to convert my InDesign files for the web and to create a searchable database.

It's not an InDesign problem, as InDesign isn't intended for the web, and it's not a Web Design problem, as it isn't a question of designing for the Web. I'm sure the web designers use their tools and are unfamiliar with mine, and vice versa.

I fail to see how this isn't a Content Management issue. I have content that I want managed on the web. I'm looking for advice, not to be pushed around the different forums.

I'm not an idiot, I know about design, web and the different packages. I know the problems involved. I just don't want the hassle.

If anyone else has any ideas please do tell.

If it is deemed appropriate to move this to another forum then do so, just leave me a link so I can find it. Thanks.

I know InDesign has XML support, but I don't know how to use it well enough to make it viable for the web. Has anyone got any experience?

Ps. I am posting this problem on as many forums as I can in the hope of an answer.

Thanks for all your help though, it's good to know that someone else has the same views on it as I do.
 
Yes, this is web content that you want managed.
But this particular forum is concerned with CMS.

What you seem to be looking for is a simple way to convert a bunch of InDesign documents to workable, linked HTML pages. I see that as being slightly different.

Sorry if you feel you are being shunted to different forums but this particular one isn't heavily used. You will get a faster response and reach a wider range of people in the Web design forum or even the XHTML/CSS forum.

Please don't infer that I think you are in any way an idiot. But likewise, please don't assume that I'm not trying to help you with your problem or that I don't know what I'm talking about.

I've been in the creative/publishing industry for 15+ years.
I spent 6 years working in print/repro, then moved on to solely design and artwork for print and ran a studio for a while.
For the last 6 years I've been working on websites and intranets and have been responsible for producing many sites. I specialise in Standards compliant XHTML and CSS based sites with Javascript, PHP and MySQL thrown in for good measure.
I have 'rolled my own' content management systems and I'm currently taking a 10 minute break from coding some extra features into a digital asset management system (which I did think may be of some help to you actually).

I also still do a bit of artwork now and again and am quite proficient with InDesign, Xpress, Photoshop, Illustrator etc.

Apologies for the mini-biog but as you can see I've sat on both sides of the fence as it were and do kind of know where I'm coming from here.

Sorry to hear you don't want the hassle of converting your InDesign artwork to usable HTML. Sadly it sounds like that is what you need to do, like it or not. There is no good, quick and efficient way to export directly from a page layout app like InDesign to *good* HTML. You can do it with Quark, you can do it via Acrobat's HTML export, and you can probably get a plugin for InDesign too, come to think of it, but in all my experience I've never seen anything but the very simplest page come out right. Plus the code produced is normally very bloated.


XML output from InDesign is a little sketchy, but theoretically you could produce XML files. They aren't going to look like your book pages, though, without the use of XSL/XSLT to transform them into properly styled web pages.
It might be an option, though you obviously know that already.
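For what it's worth, here is a rough idea of what that transformation step involves. This is a sketch in plain Python rather than real XSLT, and it assumes the exported XML uses one element per paragraph, tagged after the paragraph style. The tag names, CSS class names and sample content are all invented, so a real export will look different.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from XML tag names (mirroring paragraph styles) to
# an HTML element and CSS class. Real tag and class names will differ.
STYLE_MAP = {
    "Heading1": ("h1", "heading-1"),
    "BodyText": ("p", "body-text"),
}

def xml_to_html(xml_source):
    """Convert each top-level child of the root element into a styled HTML element."""
    root = ET.fromstring(xml_source)
    lines = []
    for child in root:
        # Unknown tags fall back to a paragraph with a class that is easy to spot.
        element, css_class = STYLE_MAP.get(child.tag, ("p", "unmapped"))
        # itertext() flattens nested inline tags, so bold/italic runs are lost here.
        text = escape("".join(child.itertext()).strip())
        lines.append(f'<{element} class="{css_class}">{text}</{element}>')
    return "\n".join(lines)

sample = """<Story>
  <Heading1>Fire Safety</Heading1>
  <BodyText>Keep escape routes clear &amp; well lit.</BodyText>
</Story>"""
print(xml_to_html(sample))
```

An XSLT stylesheet would express the same mapping declaratively. Either way, the work is in agreeing the list of styles and what each one should become on the web, not in the conversion code itself.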

I've already given you a link to the web design forum but here you are again.

Forum253
Forum215

Good luck.

 
I actually used Google for you and found something that might be of some help.




The 3rd party product they mention might be of some help. But I still don't think that you are approaching this the right way.

 
Eugene said:
April 12th, 2007 at 4:18 pm

I would have liked to see InDesign CS3 make html pages from the document, maintaining the layout. It would have been nice. I have 30 books to publish this year and they’re all going online, each layed out differently. It’s going to be nightmareish to sort this out. PDF is not an option for our online as members complained about PDF’s before, through ignorance more than anything. It’s a long story. I want a button that says, export to Web, like there is for Export to PDF. Is the packaging anybetter I wonder, to Dreamweaver??? I liked the go live one but it didn’t do a great job on it and clean up was a little difficult on a mass scale.

Been thinking on this since April then?




 
Where did you pull that from? Yes, I have been thinking about it, and converting from InDesign to RTF to HTML is a slow process. Now I have 20 more titles added to my list.

Ah well, the search goes on. I'm sure I'll still be thinking of this next year too.
 