Categories
Computing

The One Laptop per Child Dream

Do children need game consoles, mp3-players, camera phones, bedroom TV sets with inbuilt DVD players? Probably not. Few cross-cultural comparisons would suggest such devices are of any educational benefit. Indeed they distract children from other forms of play and learning, bombard them with a never-ending blur of junk information and prepare them only for a world of instant gratification, in which corporate deities impart magical gifts. You will not learn how to program by playing moronic games on your PSP or iPhone.

But technology is not necessarily bad. Many new technologies first used by the educated elite tend to emancipate users rather than enslave them to consumerist addiction. Clearly using the Internet as a research and communication tool rather than as just another channel for advertising, gambling, porn and interactive TV can level the playing field between the intellectual haves and have-nots. So could information technology ever reach out to the hundreds of millions of kids thus far spared the psychological side effects of mass-marketed gadgetry because they lack landline telephony and reliable mains electricity?

Remember the good old days when geeks would learn by writing programs in Basic at the command line and progress to C++ on college workstations? Remember the early years of the World Wide Web when a high proportion of Web sites were handcoded with little regard for eye candy, but merely for the effective and structured delivery of hyperlinked information? Today kids in the prosperous world may learn mouse and gamepad manipulation early on, but few are motivated to look under the bonnet, as long as they can download music, play games and copy and paste text and images into their homework.

Recently I splashed out over £750 (that's approx. €1150 or US $1400) on a reliable laptop, a MacBook, because my current earnings and professional needs can justify such lavishness. For many, a laptop is little more than an Internet tablet with a spell-checking notepad and a few simple games. Most of the machine's memory is used for visual desktop wizardry, which quite frankly is huge overkill. If we remove the cost of proprietary software, a bog-standard new laptop, say with a 1.6GHz CPU, 512MB RAM, a 40GB hard drive, a DVD/CDR drive and an integrated wireless receiver, can be had for as little as £250. If we strip out the optical drive and replace the hard drive with compact flash memory, now as cheap as £2 per gigabyte, we can significantly reduce power consumption. By further lowering specifications and optimising software to deliver essential Web connectivity, browsing, word processing, number-crunching and programming functionality we may soon have the £60 (€90 or US $100) laptop, complete with wind-up power generator. Think of it as a PSP or Nintendo DS with a keyboard, but without the distraction of moronic games. For children accustomed to wide-screen plasma TVs and game consoles at home such a device would fail to impress. Many technophiles may be salivating over Apple's forthcoming iPhone, replete with smudges all over its touch screen and with only Apple-approved software (it may use a variant of Unix-based OS X, but will not let you install additional software). The real battle, however, is to break the quasi-monopoly of proprietary computer vendors and produce a tool that will not only bridge the digital divide, but may reverse the intellectual divide, giving the poor educational tools and leaving high-tech hedonism to sheepish consumers.

Guess what operating system the proposed laptop runs? Linux of course, with Firefox and OpenOffice enabling users to access most Web sites and exchange files in the commonest formats (PDF, ODF and even MS Word 97-2003), but alas no iTunes or World of Warcraft compatibility. No wonder neither Bill Gates nor Steve Jobs supports this initiative. Their focus is solely on their stock prices.


Buying a Mac

For the last 15 months I've been using a second-hand laptop purchased on eBay for £200 with a 1.3 GHz Athlon processor, 256MB RAM and a 20GB hard drive. I first installed Mandriva Linux 10.0 and owing to persistent power management problems I later tried the Ubuntu/Debian-based Simply Mepis distribution. With a little tinkering I managed to get all the essential software packages and Wi-Fi connectivity up and running. Yes, Linux offers a plethora of office productivity, programming, Web development, graphic design and photo editing programmes. Offerings such as OpenOffice 2.1 for word processing, spreadsheets, diagrams, presentations and databases compatible with Microsoft-centric formats, the Gimp 2.2 for photo editing, Bluefish for web development, KMail for e-mail and Inkscape for vector drawing should meet all but the most demanding or fastidious needs. Although the second distro had superior power management, the persistent whirring of the fan, short battery life and poor design of the underlying hardware with a vent on the underside, i.e. problems that would occur with any operating system, prompted me to aspire to better. Should I get the latest 15" Dell laptop for just £499 pre-installed with Windows XP Home and then install Linux as a secondary operating system, or should I go for a Lenovo laptop now available from Linux Emporium for £599 pre-installed with Ubuntu 6.06 but lacking support for the machine's inbuilt camera and card reader? As an IT contractor I can't afford to have a machine prone to failure, overheating and viruses (plaguing mainly Windows-laden machines) or one with problematic interoperability with networks I may have to hook up with at work (still an issue with Linux). Yet I could not bring myself to buy a top-of-the-range model pre-installed with Windows, although owing to its inescapable pervasiveness this is the operating system I've used most over the last decade. I'd learn nothing new and be divorced from the Unix world, relying on tools like PuTTY to gain ssh (secure shell) access to remote Linux servers.

Instead extra earnings over the last year and a sense of inferiority led me to buy my first Mac. I shunned Macs many years ago because of their prohibitive cost and the limited availability of freeware and pirated software. Why buy a Mac and then get Microsoft Office for the Mac to interoperate with everyone else when you could easily install the same software for free on Windows, albeit illegally? Now with the advent of product activation, viable open-source alternatives and the emergence of Web 2.0 applications, enabling us to do most of our work via a browser, all that has changed. NeoOffice works a treat on the Mac, starting in 10 seconds on a basic MacBook with a 2GHz Core 2 Duo processor and 1GB RAM. TextEdit will display those bloated Word attachments in seconds and the XAMPP suite lets me develop and test my PHP and Python applications with Apache and MySQL 5. X11 and Fink let me install most open-source Unix/Linux software, while watching movies, listening to music or editing photographs in iPhoto or the Gimp. Should I ever aspire to producing professional Flash applications or refined photo-imaging, then the Adobe Creative Suite is available for the Mac too. After using Linux at home for two years (I have a desktop machine with Linux as well), the automation, responsiveness and silence of the latest MacBooks impress. I had read reports of overheating MacBooks, though unlike some Dell models none have actually caught fire. Yet in a week of solid use, mine has remained refreshingly cool and the relatively quiet fan only runs for very brief interludes, if at all.
So if you can afford the extra £200 to £300 price mark-up and do not want any hassle with drivers (an issue that still plagues Linux users despite huge advances and the wonders of Synaptic in Ubuntu), anti-virus tools (mainly an issue for Windows users) etc., then a MacBook is a fairly safe choice, and initial testing with friends and relatives would indicate that people adapt fast to the Mac way of life. I still use a 2-button mouse with a scroll wheel and occasionally confuse the command, option and control keys, but for now this MacBook indulgence will be my primary workhorse. I will continue to keep a keen eye on developments in the Linux world, the only viable OS for the proposed $100 laptop, and to use many of the same open-source apps in X11 (Bluefish, Scribus and The Gimp, to name but a few).


Where OpenOffice could do better

Some may argue that you cannot fault the quality or features of free software, so if MS Office still offers a few bells and whistles that are a tad more polished than their OOo counterparts, a potential user need simply decide if these refinements are worth £200+. Yet OOo Writer has emerged not only as the main competitor to MS Word, but as a serious challenge to the domination of one software giant, 40% of whose revenue stems from businesses' addiction to its ubiquitous Office suite. In many respects OOo Writer is more powerful than MS Word, with full support for master documents and structural markup, enabling encyclopaedia-length tomes with thousands of pages to be fully indexed. What a shame that just a few missing refinements stand in the way of making OOo a true MS Office replacement for professional writers working with others locked into the MS Office paradigm. Yes, one can use a good text editor and layout engine such as vi and TeX, but in the real world you need to return an edited MS Word document in exactly the same format, complete with comments. I hope in the near future wider adoption of ODF and further development of online writing tools such as Writely will make MS Word a thing of the past, but for the time being any alternatives must enable seamless collaboration with users of existing de facto standards.

  • User-friendly comment-editing facility so writers collaborating on a project can easily add explanations for amendments and view comments inserted by others. Currently text entered in the notes dialogue box will not word-wrap, so lines longer than 60 or so characters must be horizontally scrolled.
  • Tabbed interface: This has now become standard in the new generation of web browsers and could easily be implemented to emphasise the tight integration between components of the OpenOffice suite and discourage users from quitting the whole application just to close one document and open another.
  • Faster start-up: This has admittedly improved with version 2.0.3, but even with preloaders OpenOffice still lacks the virtually instant access that the MS Office suite provides.
  • Dashed and dotted table cell and frame borders. Why has this option been omitted? Currently the only way to achieve this effect is to manually add dashed borders from the Draw menu.
  • Simple command to cycle through upper case first (or Title Case), all upper case and all lower case, a handy feature when editing documents with incorrect or inconsistent capitalisation.
  • Full support for SVG imports. With its growing use in Wikipedia and support integrated into Firefox 1.5+, SVG is fast becoming the standard vector graphics exchange format. Additionally SVG should be used for all clipart and custom widgets. The custom bullets, icons and borders supplied with OOo are really quite naff and pixelate if enlarged.
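
The case-cycling command requested in the list above could behave roughly as follows. This is only a sketch in Python to pin down the intended behaviour, not anything taken from OpenOffice itself; the function name `cycle_case` is made up for illustration.

```python
def cycle_case(text: str) -> str:
    """Return text in the next style of the cycle: Title Case -> UPPER -> lower -> Title Case.

    Hypothetical helper; mixed or inconsistent capitalisation is first
    normalised to Title Case so that repeated calls walk through the cycle.
    """
    if text.isupper():
        return text.lower()
    if text.islower():
        return text.title()
    if text == text.title():
        return text.upper()
    return text.title()


# Repeated presses of the imagined shortcut walk through all three styles.
heading = "the One laptop per CHILD dream"
for _ in range(4):
    heading = cycle_case(heading)
    print(heading)
```

Python's `str.title()` capitalises after every non-letter, so a word like "don't" would need special handling in a real implementation.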

What is Open Source and why should you care?

Operating systems and productivity software are very much here with us to stay. The millions of person hours invested in the development of the powerful programmes many of us use every day will serve hundreds of millions of users for generations to come. Only five years ago a typical desktop system may have cost as much as £1000, so possibly investing £200 in software may have seemed, relatively speaking, a small price to pay for professionals relying on mission-critical applications. However, when hardware prices plummet as low as £200 for a basic system and software becomes as ubiquitous as word processors, spreadsheets, databases, drawing and photo editing applications, it becomes harder to justify such price markups and with the expansion of the Internet practically impossible to safeguard proprietary source code. Sooner or later someone will find a way of pirating pervasive productivity applications. However, a huge lobby of software multinationals, principally Microsoft, but also Apple, Adobe and to a lesser extent Oracle and Sun Microsystems, has vociferously promoted the concept of intellectual property, e.g. if you write a book in Microsoft Word, Microsoft owns the copyright for the format in which you saved it. Some may argue the last three companies may have belatedly embraced open source out of sheer opportunism to beat one well-known leviathan.

As millions of users simply take the omnipresence of leading proprietary packages for granted, many myths about Open Source software abound.

  1. Open source is bad for software developers: Open source software is not necessarily free, but its source code and file formats are made publicly available so that other programmers can improve on and interoperate with the application. Usually you pay for support, but may download the software for free.
  2. Pirated software is against the law: Open source is 100% legal, unless you infringe the terms of the licence (GPL = General Public Licence or GNU GPL), which usually means reselling the software. Do not confuse legal open-source software with pirated closed-source programs. To some extent the major players tolerate some degree of pirating in order to get the public hooked on their proprietary formats and reap huge rewards from commercial users legally obliged to have licensed software. Open source simply has a different kind of licence.
  3. Open source is bug-ridden and virus-prone: Very few viruses have ever originated from open-source programs. Viruses tend to come as e-mail attachments or are planted on your computer via deceptively marketed malware such as Winfixer. The leading open-source projects Mozilla Firefox, OpenOffice, The GIMP and Clam Anti-Virus (all available for Windows and Linux) will not leave you with an unworkable machine. Compare and contrast this reality with numerous trojan horse programs that install themselves due to inherent weaknesses in leading proprietary applications.
  4. My employer won't allow it: IT departments that restrict use and installation of open source software may offer excellent excuses for doing so, but ultimately they serve the interests of large multinationals who make a fortune out of their quasi-monopolies. Why should tax-payers subsidise Microsoft so that social workers and teachers can use Microsoft Office instead of a perfectly functional and user-friendly, but much cheaper, alternative like OpenOffice?

Now let's tackle the issue from a different perspective. Do we pay the German inventors of the printing press copyright or licensing fees more than five hundred years after their revolutionary innovation? Of course we don't, but we still use technology that has been gradually adapted by different engineers in different places over the centuries. If I use a Hewlett Packard Laserjet printer to reproduce my scripts on paper, HP's involvement begins and ends in the transfer of information from an electronic to a paper format. If the HP printer breaks down, I could just buy a different make of printer to do the same job. If I wish to reproduce the same text as HTML, PDF or Microsoft Word 97/2000 (soon to be superseded incidentally), I should surely be able to choose the most convenient and cost-effective tool and should not have to buy constant software upgrades to interoperate with others who use these formats. HTML has always been an open standard. I may handcode it in any rudimentary text editor or use a wide array of free open-source programmes to generate HTML for me. Like Postscript, PDF started life as a proprietary industry standard, but Adobe has since published its specification. Linux and Mac systems will let you output any text or graphical output as PDF. If I want polished professional output I may choose to invest in Adobe Illustrator or InDesign, or I may go with Corel Draw or Xara Xtreme. Even OpenOffice Draw will output perfectly adequate PDF. So why do I need to donate some of my hard-earned cash to one multinational just to ensure complete compatibility with the latest incarnation of its flagship office suite, whose admittedly pervasive formats have no inherent benefits over open-source alternatives?

The OpenDocument Format is the result of collaboration between major players in the software and information business. Its committee includes not just IBM, Sun, Corel and Oracle, but Microsoft too. The specifications of the human-readable XML-based standard are in the public domain. Anyone may implement and support it. Some analysts claim that the principles behind ODF are fine, but in the real world ODF will suffer the fate of Esperanto faced with competition from English. ODF resembles (but predates) Microsoft's new Open-XML (note the confusing name) much more closely than the old Word 97/2000/2003 format, but it is this latter standard to which the business world is addicted. This offers advocates of open standards a huge window of opportunity.
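
To see what "human-readable XML-based" means in practice, here is a minimal Python sketch using only the standard library. An OpenDocument text file is a ZIP archive whose main text lives in a member called content.xml. The sample below is a deliberately stripped-down stand-in built in memory (a real file also carries a mimetype entry, a manifest and styles), and the helper names `build_sample_odt` and `extract_paragraphs` are invented for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument standard.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Stripped-down content.xml; an assumption for illustration, not a complete document.
SAMPLE_CONTENT = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:h>Minutes</text:h>
      <text:p>Anyone may implement and support it.</text:p>
    </office:text>
  </office:body>
</office:document-content>"""


def build_sample_odt() -> bytes:
    """Pack the sample content.xml into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", SAMPLE_CONTENT)
    return buffer.getvalue()


def extract_paragraphs(odt_bytes: bytes) -> list[str]:
    """Return the text of every heading and paragraph, in document order."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    wanted = {f"{{{TEXT_NS}}}h", f"{{{TEXT_NS}}}p"}
    return ["".join(el.itertext()) for el in root.iter() if el.tag in wanted]


print(extract_paragraphs(build_sample_odt()))
```

No office suite is involved at any point: a ZIP library and an XML parser are enough to get at the words, which is precisely the argument against binary lock-in.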

Format Lock-in

At times it seems that those who would stand to benefit most from open source software are those who are least aware of it or worst informed about it. Many professional writers have used freely downloadable OpenOffice Writer for 600+ page books, replete with a table of contents, footnotes, indices and bibliography, and neatly formatted to pre-press standards. Many successfully collaborate with other writers using rival proprietary software such as, err, Microsoft Word. Yet your run-of-the-mill desktop user with much more modest needs feels obliged to purchase the latter application, allegedly to ensure compatibility with documents that others may send. In a rational world lightweight users of word processors would be content with an application bundled with their pre-installed operating system, which ideally should not exceed 10% of the outlay for a new computer. Conversely, someone who writes for a living might actually splash out to acquire the best tools of the trade (there are other heavyweight proprietary tools such as Corel WordPerfect and Adobe FrameMaker). When people are forced to buy a proprietary package just to import files created by others, or fear that migrating to an alternative package would lead to extra costs in training and user familiarisation, we call this vendor lock-in. Indeed some public sector organisations sign contracts giving them discounts on proprietary software, which already has a highly inflated price, in exchange for a commitment to use this software exclusively for a set period. This is in effect a huge public rip-off.

Open standards, freely usable by any user or vendor.

Proprietary Multi-Vendor Multi-Platform Standards, effectively open-source but with licensing limitations

  • Rich Text Format (.rtf) (MS)
  • Mp3 audio
  • Mpeg 2 and 4 video
  • .wav audio file format (MS)
  • .avi video file format (MS)
  • GIF (theoretically still owned by Compuserve)

Common Proprietary Formats

  • MS Word (widely supported by rivals but with varying degrees of fidelity in the finer aspects of presentational formatting. This format is, however, soon to be superseded by an XML format very similar to OpenDocument.)
  • MS Excel
  • MS Power Point
  • MS Publisher (not supported by major rival products)
  • MS Access
  • Oracle
  • MS SQL Server (sometimes confusingly abbreviated to SQL)
  • MS Windows Media Video
  • Real Media Video
  • Apple QuickTime Video
  • Adobe Photoshop .psd
  • Adobe Illustrator .ai and .eps
  • Adobe Flash (vector graphics, interactive animation)
  • Adobe Shockwave (interactive gaming and e-learning)
  • MS ASP.NET server-side programming with C# or Visual Basic
  • MS IIS with Windows Server 2003

OpenOffice, currently version 2.0.3, offers excellent compatibility with Microsoft Office file formats, but omits support for the MS Publisher format. People use this substandard application (OpenOffice Draw is simply a superior product) because it is bundled with MS Office and widely used in public services. Once familiar with its interface and having mastered a few tricks to get it to yield the desired results, many users are loath to switch to something else. This means effectively that professional printers need a copy of this program though they'd never dream of using it themselves. Many interoperability issues could be solved if in their infinite wisdom Microsoft endowed this program with an integrated PDF export feature (standard on all rival applications such as the freely downloadable Scribus). So rather than waiting for the Redmond Giant to embrace open standards, we should switch to alternatives that do. You should value your Word documents not for their proprietary binary format, but for their actual contents, so opening the same files in OpenOffice and saving them in the OpenDocument Format or PDF will not change your creation, but merely free you of reliance on one software vendor.

Nobody should be compelled to purchase a program or even download a trial copy just to view a file that could easily be reproduced in an open format. Consider HTML. The W3C may regulate it and browser vendors may implement support for it in slightly different ways, but nobody owns the standard. If I feel a proprietary HTML editor is significantly better than an open-source alternative, then I might invest in it, but there are plenty of open-source HTML editors such as NVU which will work fine for most users. If you only edit a few holiday snaps, then The GIMP or Google's Picasa will do fine. If you have more sophisticated requirements, then you might just go for Adobe Photoshop, Xara Xtreme (now available for Linux too) or PaintShop Pro.

Open Source Alternatives

  • Microsoft Word: OpenOffice Writer, KWord, AbiWord
  • Microsoft Excel: OpenOffice Calc, Gnumeric, KSpread
  • Microsoft Access: OpenOffice Base / MySQL / PostgreSQL
  • Microsoft Power Point: OpenOffice Impress
  • Microsoft Visio: OpenOffice Draw, Dia
  • Microsoft Publisher, Corel Draw: Scribus, OpenOffice Draw. Both are excellent alternatives. Draw integrates nicely with OpenOffice, while Scribus provides features that rival Corel Draw and even Adobe InDesign. Sadly neither imports your old Publisher files.
  • Adobe FrameMaker: OpenOffice Writer (yep, it's that powerful)
  • Adobe Photoshop / Corel Paintshop Pro: The GIMP, Xara Xtreme
  • MSN Messenger / Yahoo Messenger: GAIM
  • Internet Explorer: Firefox / SeaMonkey / Opera
  • Outlook Express: Mozilla Thunderbird
  • Full MS Outlook with scheduler: Gnome/Novell Evolution (fully compatible with MS Outlook Exchange Server) or Thunderbird with Mozilla Sunbird (fine for small offices)
  • MS Front Page / Adobe Dreamweaver: NVU (all platforms) / Bluefish, Quanta (Linux only)
  • Adobe Flash MX Development Application: No real competition for advanced stuff, but try Flash4Linux; OpenOffice Impress will also export slide shows to Flash.
  • Adobe Illustrator, Freehand, Corel Draw: Inkscape, with excellent SVG and PDF support but not quite as refined as industry-leading proprietary alternatives
  • Final Cut Pro, Adobe Premiere: Some projects such as Kino are promising, but as yet if you want professional video editing software, you still need to go proprietary.

Why are you still using Microsoft Internet Explorer?

Screenshot of IE6 after Youtube withdrew support for it.

You may just be wondering why it matters. Microsoft have their operating systems on over 90% of the world's desktops and these just happen to ship with IE as the default browser. Not only that, some sites use code specifically designed for Microsoft technologies like ActiveX, so why not just take it easy and reap the benefits of interoperability that stem from the dominant browser? Why not just leave Firefox to the geeks? Besides, "if the guys at Mozilla want to compete they might consider embracing Microsoft-compliant code".

First let's get a few facts straight.

  1. Firefox is simply better: Gecko-based browsers are way ahead of IE6 in their support for XHTML and cascading style sheets. IE7 may have copied tabbed browsing from Opera and Firefox but it hasn't fully embraced CSS 2.0.
  2. Firefox is almost infinitely extensible and customisable: Besides RSS feeds, multiple search bars, multi-coloured tabs with progress bars, image zooms, Wysiwyg HTML editors, you can even have a full integrated FTP client, automate the download of whole sites for offline reading, hundreds of themes or skins to choose from and more.
  3. Firefox blocks unwanted popups: by default Firefox only allows user-activated pop-up windows.
  4. Firefox does not let in viruses: As the browser shields itself from the operating system and does not use Active-X, you can only install viruses if you run a file that has permission to execute on your computer. This is primarily a problem with Windows, but user stupidity can wreak havoc on any system.
  5. Firefox lets you resize all text: Many Websites use absolute font sizes that IE will not resize without additional software. In Firefox just press Ctrl/Cmd with + or – to increase or decrease font size.
  6. Keep everything in one window: With the Tab Mix Plus extension you can easily force all external links to open in a new tab within the same window with tabs spanning multiple rows if necessary. Next time you restart your browser, it will restore your previous browsing session for you.
  7. What about e-mail integration: If you want an integrated mail client to replace Thunderbird (open-source alternative to MS Outlook), you can always install SeaMonkey with an advanced HTML editor and IRC chat client.
  8. Microsoft uses older technology: Only sites still using Active-X (which Microsoft is phasing out in IE7 and Windows Vista incidentally) and non-compliant JScript as opposed to standard Javascript become unworkable in Firefox. There are no special tricks that can only be achieved in IE; only bad coding is responsible. The most notorious sites that caused problems with Firefox last year included http://www.jobcentreplus.gov.uk and http://www.odeon.co.uk . Both have now been upgraded. At one stage users of Yahoo's Web-mail service could not add rich-text formatting to their emails in Firefox. No longer: Yahoo has made its services Firefox-compatible. As many web designers know, with more sites now relying on style sheets for layout and ditching old in-line markup, you're more likely to encounter quirks in IE than Firefox. Consider this site: the Tux penguin below the menu has a transparent background in Firefox and Safari, but appears on a white rectangle in IE.
  9. Firefox is free and open source: Anyone can download, install and redistribute it. Not only that, the source code is publicly available.

So if you still think Firefox is just for Microsoft-bashing nerds, why on earth would the Redmond Giant want to turn its next version of Internet Explorer into Firefox Light with a legacy Microsoft quirks compatibility mode?

Download, install and try it! If you still don't like it and prefer IE, Safari or Konqueror, simply reset your preferred browser as default. But if you like surfing as much as I do, the only real competition is Opera.

What about Netscape?

Well, it's still around, but in 1999 AOL bought it and shortly after discontinued independent development of the old 4+ code base, promoting its own branded version of Internet Explorer instead. Back then IE had better support for W3C standards, as Netscape included a number of non-standard enhancements such as layer instead of div tags. User numbers dropped rapidly to under 3% until 2000 when Netscape 6.0 appeared with a beta Gecko engine at a time with few CSS-layout sites. Version 7 saw some improvements, but in an era of almost complete IE domination. The latest offering outsourced to Mercurial Communications but distributed by AOL combines Firefox's Gecko engine and IE's Trident engine with a few bells and whistles that you could add to Firefox simply by installing extensions. Moreover, Netscape 8 is only available for Windows XP. Current figures show Firefox with over 10% of browser usage and considerably more in continental Europe with Netscape still under 1% and both Opera and Safari gaining niches between 1% and 3%. However, as of this writing Firefox tends to appeal to heavy users and web developers (just consider the way the Web Developer's tool bar lets you deconstruct and analyse a Web site), so fewer than 10% of computer users have adopted an alternative to IE. Remember the web is yours and not the sole domain of one multinational corporation.


Non-Web Formats

The World Wide Web is a collection of interlinked documents distributed in open formats compatible with the greatest number of heterogeneous operating systems and devices. Its standard text markup language is HTML, which has undergone numerous revisions since the Internet's rapid expansion in the early 1990s. XML, in many ways a descendant of the more complex SGML, is the default standard for data exchange between diverse systems. Almost any kind of data can be marked up and accurately described using a specialised dialect of XML. Applications range from MathML for mathematical notation, RSS for news feeds, MusicXML for faithful representation of music and CML for chemistry to SVG for scalable vector graphics. Recently HTML has morphed into XHTML combined with cascading stylesheets to separate style from content and customise formatting for different devices and media. All Web browsers render HTML and most modern browsers reproduce XHTML with CSS reasonably well. More importantly, not only do all Web editors output HTML, but so do most word processors and desktop publishing applications. Besides, numerous user-friendly Web tools can be downloaded free of charge to enable almost anyone with access to a computer to produce their own Web pages without learning a single HTML tag.
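
As a small illustration of how easily an XML dialect can be consumed by any system, the Python sketch below reads the items out of an RSS 2.0 feed with nothing but the standard library. The feed text, its example.org links and the helper name `list_items` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real one would be fetched from a site.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Computing</title>
    <item>
      <title>Buying a Mac</title>
      <link>http://example.org/buying-a-mac</link>
    </item>
    <item>
      <title>Where OpenOffice could do better</title>
      <link>http://example.org/openoffice</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every item in the feed's channel."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]


for title, link in list_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

The element names describe the data rather than its appearance, so the same feed can drive a news reader, a browser sidebar or a script like this one.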

So why does the Internet abound with PDFs and Microsoft Word documents? Both are proprietary binary formats, although Adobe developed the Portable Document Format to allow cross-platform compatibility and has been keen to allow other vendors to provide a PDF export option. PDFs are admittedly often the only realistic way to reproduce complex formatting on diverse systems. Until Scalable Vector Graphics and CSS3 with multi-column layout are fully implemented in mainstream Web browsers, PDFs will remain the only practical solution for the accurate reproduction of the output of desktop publishing applications via the Web. But surfing the Web is not the same as slowly contemplating a glossy magazine, it's about navigating through a web of hypertext pages to gain fast access to related information.

  • PDFs are nearly always much larger than equivalent HTML pages, sometimes 10 to 20 times larger just to include a few small logos or photographs.
  • Software used to convert word processor and desktop publishing files to PDF (notably Acrobat Distiller combined with MS Word or Publisher) converts most graphics, including custom frames and borders and often non-standard fonts, to bitmaps further boosting file size. In reality only graphics applications like Illustrator, InDesign, FrameMaker, Freehand or Corel Draw can produce polished graphics taking full advantage of PDF's vector graphics rendering capabilities.
  • Embedding fonts further increases file size, by 30 to 40KB per font.
  • PDFs are designed to reproduce formatting, not semantic information and structure.
  • Adobe Acrobat Reader is a memory-intensive application, which even in the era of 3GHz CPUs can take over 45 seconds to load, cause computers to crash or force the user to close the browser in which it loads.
  • PDF files interrupt the general Web experience. Inexperienced users are confronted with a radically different interface without the usual navigation features and even the back button will not work if it loads in a new window.

Most programs used to write Web content can convert text more easily to HTML than PDF. Most notably Microsoft Office applications lack a native PDF conversion capability (you'll need to buy Adobe Acrobat Distiller or Jaws PDF for that feature), but will save to HTML, albeit Microsoft's implementation thereof. OpenOffice and Corel PerfectOffice will let you save any document in both formats and even Adobe InDesign and PageMaker have HTML-export facilities. So if you think PDFs are better, why not let users choose? If they just need information, most will stick to HTML, but if they genuinely wish to view or print the full splendour of your artwork they can wait a few minutes to view your graphics-rich PDF. One should never need to download a PDF just to read a bus timetable, the agenda of a meeting or even a lengthy report. HTML is a much more versatile and lightweight way of distributing textual information. Savvy readers can easily adjust the text size or change background colours without needing to scroll horizontally.

Even worse is the profusion of Microsoft Word documents, especially in public sector, research-oriented and academic sites. In many portals, a site-wide database search returns a list of relevant Word, Excel and PDF documents. To view the actual content you need an application capable of reading these binary formats. At fault is usually the content management system: members of staff are simply allowed to upload documents, so they publish their files in their original format. All too often one reads statements like "for the minutes of the last meeting please read minutes67.doc". Although even in the non-Microsoft world one can view 99.9% of Word documents in OpenOffice, it means starting a memory-intensive application just to load a file that is many orders of magnitude larger than an equivalent HTML document, and in the vast majority of cases contains no formatting that could not be easily reproduced in standards-compliant XHTML with CSS. More importantly, very few users will need more than the information contained in the document.

Myths and Excuses

Claim 1: HTML documents print badly.
Truth: HTML documents using absolute-sized tables for layout without a print stylesheet print badly, often extending beyond the page width. Sites built with separate print stylesheets can easily reformat a page to hide menus, headers and footers and print only the main body, neatly spanning the printable width of a page. Browsers like Mozilla Firefox can interpret many advanced CSS2 printing properties and let users customise the way a page prints.
Claim 2: PDF files are accessible.
Truth: Just add 30 sec. to 5 mins to the average download time and factor in the inconvenience of another application starting in the background.
Claim 3: Publisher files can be distributed as PDFs.
Truth: First you need an extra application such as Adobe Acrobat Distiller to communicate with Microsoft's PostScript driver to do that. Second the results are often very unprofessional with pixelated bitmaps replacing smooth curves. Third the resulting file size is often ginormous even if you select Screen-optimised.
Claim 4: Everyone has Word
Truth: Many users are not using a computer (e.g. Web TV or 3G phones) or have restricted access to external applications e.g. in a library or Internet Café. Word is also a hugely overpriced application and since Word 2000 requires product activation, limiting use to a single machine. Most other Word processors will open MS Word documents, but often tables and textboxes are poorly aligned. Word is simply not a Web format! Indeed Microsoft only belatedly introduced hyperlinks in 1997 (Word 95 required an add-on utility for this functionality).
Claim 5: What about Document Exchange, such as application forms that people need to return in the same format?
Truth: First, in many cases HTML forms let users with only a Web browser complete complex and well-structured application forms. Emerging standards like XForms (supported by OpenOffice 2.0) will greatly enhance the ease with which visitors can submit information to and interact with Web sites. Second, any HTML page can be literally cut and pasted into a word processor, edited and saved as HTML. Third, admittedly some complex table and multicolumn structures are still hard to render in HTML, but since the late 1980s we have had a cross-platform word processor format, RTF or Rich Text Format, that will faithfully reproduce all textual content, including tables, indices, headers, footers, as well as embedded graphics with full support for styles. Indeed even MS Word uses RTF to convert to earlier versions of Word. Unless you want to impress employers with WordArt, RTF files will load fine into any word processor. Besides, the future clearly lies with XML.
Claim 6: Staff are not trained in HTML!
Truth: Most users of word processors are not trained in ever-changing binary formats like MS Word either, which are considerably more complex. They simply type, apply headings, bulleted lists, highlight, add a little colour and change fonts. How long does it take to teach someone to select Save As HTML from the file menu? You can even get macros to automate the task completely and insert the HTML output directly into e-mails, rather than annoyingly attaching a bloated Word document.
Claim 7: I need my spell checker!
Truth: These days, spell checkers are integrated into most applications that process text: e-mail clients, desktop publishing suites, word processors and HTML editors. Besides, once you've used your favourite word processor to verify your orthography, you can save your document as HTML.
Claim 8: HTML does not support WordArt
Truth: HTML is about conveying information. All headings should be reproduced as text enclosed in heading tags and not as frivolous graphics. Of course you can add fancy text as an additional graphic, but you'll find much better tools than MS Word for that purpose. Currently this means using PNG, GIF or JPG bitmaps, but when major browsers support SVG, many Web pages will begin to resemble the creations of high-end desktop publishing programs, while being simultaneously viewable in text-only mode.

Structural Formatting

Most users of word processors just use icons and drop-down menus to change fonts, sizes and colours or occasionally add bulleted lists and tables. By contrast, strict HTML, and even more so XHTML, insists on structural mark-up. A heading is simply not a line of text with inline formatting to change its appearance. It is an element marked up as a heading. The same goes for other structural elements like paragraphs, lists or tables. The structure tells us how the elements relate. Let us consider two simple examples. First, we wish to generate a table of contents. If we have marked up all headings as a hierarchical sequence of headings and subheadings, many applications (including OpenOffice and even MS Word) will automatically generate a Table of Contents for us. Now consider a search engine trying to make sense of millions of words in thousands of documents on the Web. How does it rank documents responding to the search terms snails and evolution? Clearly thousands of documents will contain both terms, but if these terms appear in identifiable headings they will be ranked much higher. An article containing both in the main heading would be ideal, e.g. The Evolution of Snails, but an article containing evolution in a main heading and snails in a subheading would also rank high. Now suppose a hastily converted article contains evolution of snails in a normal paragraph element, to which the word processor applied inline formatting to make it stand out. The search engine would just ignore the inline formatting and treat it as a normal line of text, thus giving it a much lower ranking. As a rule word processors will only convert lines of text that look like headings into real HTML headings if you use styles. Fortunately, this feature is easily accessible in all leading word processors, though completely ignored by most casual users.
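The table-of-contents example can be sketched in a few lines of Python. This is only an illustration using the standard library's html.parser module; the sample page, the HeadingCollector class and the table_of_contents function are hypothetical names invented for this sketch, not part of any word processor or search engine. It shows that text marked up as a real heading can be found mechanically, while a paragraph merely dressed up to look like a heading is invisible to the same tool.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading currently being read
        self._parts = []     # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        # Text outside a heading element is ignored entirely.
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Return an indented outline built only from structural headings."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


# Hypothetical sample: a real heading hierarchy plus one "fake" heading that
# is just a paragraph with inline formatting.
SAMPLE = """
<h1>The Evolution of Snails</h1>
<p>Introductory text.</p>
<h2>Shell Shapes</h2>
<p><font size="6"><b>Looks like a heading</b></font></p>
<h2>Habitats</h2>
"""

for line in table_of_contents(SAMPLE):
    print(line)
```

The outline contains the three real headings only. The bold, enlarged paragraph is skipped, which is roughly what happens to it in an automatically generated table of contents or, as far as one can tell from the outside, in a search index.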

Modern Web sites like to maintain a consistent look and feel with stylesheets. Additional inline formatting added by many leading word processors (most notoriously by Word 2003) not only considerably boosts file size, but overrides the default stylesheet, limiting HTML's inherent versatility. In this case, one should save as plain or filtered HTML. With the spread of content management systems for most large Web sites, more and more regular Web editors use HTML editors embedded within a Web page, requiring only a Web browser. These may use a Java applet (which most Web browsers support), JavaScript, ActiveX (supported only by Internet Explorer) or XUL (supported only by Gecko-based browsers like Mozilla Firefox). The latest generation of embedded XHTML editors ensures that any formatting applied by the user is automatically converted to standards-compliant code compatible with the Web site's stylesheet.
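What "filtered" HTML means in practice can be illustrated with a rough Python sketch. It is not how any particular word processor or embedded editor does the job; the InlineFormatFilter class, the strip_inline_formatting function and the lists of tags and attributes to drop are all assumptions chosen for this example. The idea is simply to keep the structural elements and throw away the inline presentation so that the site's own stylesheet can take over.

```python
from html import escape
from html.parser import HTMLParser

# Assumed lists for this sketch: purely presentational wrappers and attributes.
# Dropping every class attribute assumes the classes came from the word
# processor rather than from the site's own stylesheet.
PRESENTATIONAL_TAGS = {"font", "span"}
PRESENTATIONAL_ATTRS = {"style", "class", "face", "size", "color", "align", "bgcolor"}


class InlineFormatFilter(HTMLParser):
    """Re-emit a fragment, keeping structure but dropping inline formatting."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in PRESENTATIONAL_TAGS:
            return  # drop the wrapper, keep whatever it contains
        parts = []
        for name, value in attrs:
            if name not in PRESENTATIONAL_ATTRS:
                parts.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        self.out.append("<{}{}>".format(tag, "".join(parts)))

    def handle_endtag(self, tag):
        if tag not in PRESENTATIONAL_TAGS:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def strip_inline_formatting(fragment):
    """Return the fragment without inline styles or presentational wrappers."""
    parser = InlineFormatFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


# Hypothetical fragment in the style of word-processor HTML export.
WORD_HTML = (
    '<h1 style="color:red"><span style="font-size:24pt">Minutes</span></h1>'
    '<p class="MsoNormal" style="margin:0">See the '
    '<a href="agenda.html">agenda</a>.</p>'
)

print(strip_inline_formatting(WORD_HTML))
```

The heading, paragraph and link survive with their meaning intact, while every font size, colour and margin is left for the stylesheet to decide. A production filter would also need to handle empty elements such as line breaks and images, which this sketch ignores.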

In sum, technology has already rendered proprietary word processor formats obsolete on the Web. They only persist thanks to the domination of one well-known multinational and its grip on corporate, academic and public-sector users.

Reclaiming Word

Screenshot of OpenOffice Writer 2.0 running on Mandrake Linux

If you own a computer, you probably have some form of word processor. Whether you need to type a report at work or a letter at home or maybe just a short shopping list, chances are you think you need Word or rather Microsoft Word™. How could we possibly manage without WordArt, ubiquitous in nursery schools and on church noticeboards worldwide? Don't messages look so dull if left in a dated serif font? Isn't it just wonderful that we can highlight text in bold and change its colour? And just in case we make the odd typo, we've got a spell checker to boot.

Now use a pocket calculator or the free one that comes with your operating system to do some simple maths. Each Microsoft Office licence costs between £90 and £400 depending on the package (Standard, Student, Professional, Business, Developer) and applicable discounts. For the sake of argument let's assume an average spend of £150. Now let's just take the population of the prosperous world, around a billion, and assume one in eight (1/4 of the working population) requires an MS Office suite either at home or at work. That's 125 million licences, or a whopping £18.75 billion straight into the coffers of one leviathan every two to three years, just for software that was developed by numerous teams of programmers over the last 30 years. Indeed every single major feature available in Word and Excel was pioneered by other programs such as WordStar, WordPerfect, Lotus 1-2-3 and Corel Draw long before Microsoft took the market by storm in the mid-1990s. Computers may now be much faster with superior graphics and the interface has been jazzed up, but mail merge, spell checking and multicolumn layout have been with us since the late 1980s. The processing power of your average PDA or even mobile phone is greater than that of a 1988 word-processing typewriter complete with a 10" monochrome monitor and a revolutionary 3.5" floppy disk drive. I once calculated that 300 pages saved in WordStar 4 could fit onto one double-density 720KB floppy. We can now store the contents of around 180 such floppies on one 128MB flash memory card.
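For anyone who would rather let the computer do the sums, here is the same back-of-the-envelope arithmetic as a short Python sketch. The inputs are the rough estimates given in the paragraph above, not measured data, and the storage comparison assumes binary units (1MB = 1024KB) and 720KB double-density disks.

```python
# Rough estimates taken from the text; none of these are measured figures.
population = 1_000_000_000       # the "prosperous world", about a billion
share_needing_office = 1 / 8     # one person in eight
average_spend_gbp = 150          # assumed average licence cost in pounds

licences = population * share_needing_office
total_spend_gbp = licences * average_spend_gbp
print(f"{licences:,.0f} licences, £{total_spend_gbp / 1e9:.2f} billion")

# Storage comparison, assuming binary units (1MB = 1024KB).
floppy_kb = 720                  # double-density 3.5" floppy
card_kb = 128 * 1024             # 128MB flash memory card
pages_per_floppy = 300           # the WordStar 4 estimate

floppies_per_card = card_kb // floppy_kb
pages_per_card = floppies_per_card * pages_per_floppy
print(f"{floppies_per_card} floppies, about {pages_per_card:,} pages per card")
```

With these assumptions the script reports 125 million licences worth £18.75 billion, and 182 floppies (some 54,600 WordStar pages) per flash card.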

Bloated Word Documents

I recently received the contents of a web page as a word document, all 26 megabytes of it, a long download even in the age of broadband Internet. The resulting web page with around 300 words and 6 pictures occupies around 120KB on the remote server. An extreme example because unedited digital pictures had just been imported into Word and manually resized and aligned. The other day I received an e-mail with a Word attachment advertising a conference, word count 158, character count 1157, byte count in excess of 1,600,000, all for one mediocre logo. Had the information been cut and pasted into an e-mail, it would have added 2KB at most! The document contained no formatting that could not be easily reproduced in any half-decent e-mail programme such as freely downloadable Mozilla Thunderbird.

So what if Microsoft makes a fortune from its monopoly? Isn't Word just the most user-friendly text editor out? You might have guessed it, but I'm typing this rant in OpenOffice 2.0 and as a seasoned MS Word user I've yet to find anything that OpenOffice cannot do just as well as its premium-rate competitor.

MS Word's essential features have hardly changed since version 6. Word 97 saw a new file format causing temporary incompatibility with a large pool of Word 6 users. Word 2000 had multiple copy and paste and Word XP has belatedly embraced XML, albeit Microsoft's implementation thereof. But let's face the facts, your average Word user does not know how to use styles, autocorrect, autotext and automated tables of contents, let alone craft advanced XML projects. Yet every single one of these features is now available in OpenOffice Writer and version 2.0 has enhanced MS-Office compatibility.

Millions of documents are formatted day in day out with little more than the dropdown font type and size selectors, bold, italics and underline. Creative users will play around with WordArt, insert an image from the ClipArt library, embed a digital photo or paste in one acquired from the Internet. Advanced users may insert the odd table, add hyperlinks or even spread text over multiple columns. But only a small minority of Word users have more than scratched the surface of the programme's potential, and why should the rest bother? If you're not a technical writer, legal secretary, translator or web developer, why should you care if the heading of your report is merely set to Arial size 24 or is actually set to Heading 1 in the style dropdown? Now imagine you need to create a table of contents after drafting an 80-page instruction manual with 64 sections, 257 subsections and 2429 footnotes, and your boss will probably ask you to make many more post-edits. If you had structured your document with hierarchical headings, the task could be automated and the TOC would automatically update whenever the page number of a section changed.

Format Wars

Back in the early 1990s it was customary to specify word processor formats. As a technical translator, I'd often receive files in WordStar, WordPerfect, AmiPro, Word 5 for the Mac as well as Word 2.0 and Word 6.0/95. To this list, we may add the tools used by publishers and graphics professionals such as PageMaker and QuarkXPress and let's not forget the programmable typesetting language TeX and the more user-friendly LaTeX, used by academia and publishers especially in the Unix/Linux world. All had irksome interoperability issues with formatting, accented characters and macros. Not surprisingly many agencies insisted on the cross-platform RTF standard (Rich Text Format). By 1997 Microsoft had for all intents and purposes nixed all serious competitors and used their new-found strength to impose a new de facto standard. Millions of Word 6.0/95 users will recall the compatibility woes they endured with the first batch of Word 97 files. Even after downloading a converter from Microsoft (in the days of 14.4 and 28.8kbps modems), the results were often unreadable or required time-consuming reformatting. It took Microsoft two patches to get its Word 97 to Word 6.0 converter working properly. Indeed it probably took two more years for Word 97 to establish itself as the dominant format. But those who argue that the Word .doc format is here to stay and is essential to collaboration and interoperability have a surprise in store for them. In 2006 Microsoft will in effect ditch their own de facto standard by making the new XML-based format the default save option (in my experience fewer and fewer run-of-the-mill Word users are familiar with features such as "Save As" for converting to other formats). Want to know why? Well why not read Microsoft's official reasons straight from the horse's mouth (Microsoft Office Open XML Formats Overview).
Evidently XML-based formats are not only more transparent, but partially corrupted files are much easier to recover, because XML is human-readable and lends itself much better to parsing by third-party programmes. Wow, that's what the guys at OpenOffice have been arguing for years. Both the new OpenDocument and the older SXW formats are XML-based, storing text, style and pictures in separate XML files embedded in one jar-compressed file. Microsoft's new format uses Windows-centric zip-compression, but the essential idea is the same. Word 2000 and 2002 users will be able to download an update to read the new XML-based format, but millions of extant Word 97 users will soon find their product totally unsupported by the Redmond Giant. They could download OpenOffice 1.1.5 or 2.0 beta, both of which support MS-Word XML, but sadly many will be persuaded to part with more hard-earned cash for an MS-branded upgrade.
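To see why a zipped bundle of XML files is so much friendlier to third-party tools than a binary format, consider this minimal Python sketch. It does not read or write real OpenDocument or Microsoft files: the XML below is a heavily simplified, hypothetical stand-in (real packages use namespaced elements and contain more parts), and build_package and extract_paragraphs are names made up for the illustration. Only the general layout, with text and styles held in separate XML parts inside one compressed archive, reflects the formats described above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-ins for the XML parts of a document package.
CONTENT_XML = "<document><p>Minutes of the last meeting</p><p>Item 1</p></document>"
STYLES_XML = "<styles><style name='Heading 1' font='Arial' size='24'/></styles>"


def build_package():
    """Create an in-memory zip archive holding text and style separately."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("content.xml", CONTENT_XML)
        archive.writestr("styles.xml", STYLES_XML)
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the names of the files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return archive.namelist()


def extract_paragraphs(package_bytes):
    """Recover the plain text using nothing but generic zip and XML tools."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return [paragraph.text for paragraph in root.iter("p")]


package = build_package()
print(list_parts(package))
print(extract_paragraphs(package))
```

No office suite is involved at any point: a general-purpose archive library and an XML parser are enough to get the words back out, which is precisely the transparency argument.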

Bells and Whistles

In 1993 I set about buying my first PC with a windowing graphical user interface. "What software can I install?" I asked the owner of a local shop, and added, "I'll need a word processor, ideally MS Word", as that's what most of my clients, mainly translation agencies, required. "We just use Corel Draw 3", he replied. "But surely that's just for drawing?" I quipped. "No, no, it's good for flyers and most correspondence with our customers." Corel Draw 4 even had a spell checker and what's more you could stretch and bend text on a machine with little more than four megabytes of RAM. If you never wrote letters spanning more than two to three pages (multi-page text flow is a bit of an issue in Corel Draw), Corel Draw 3 would do you fine. Now you know where Microsoft drew their inspiration for the inclusion of WordArt in their 1995 edition of Word!

Many myths abound about open-source software. All alternatives to MS Word import and export MS Word documents. OpenOffice even imports WordArt, but relies on FontWorks and a fully integrated drawing application to create fancy text, drawings and charts. Admittedly some incompatibility remains, but this mainly relates to minor aesthetic and alignment quirks, e.g. MS Word tables sometimes extend beyond a page width in OpenOffice, because Word corrects manual resizing, and OpenOffice does not allow dashed table borders because dashed lines were not specified in the cross-platform universal Rich Text Format (this is probably one of the biggest deficiencies in OpenOffice). Most notably, in version 1.0 Word VBA macros associated with a file will not work, but OpenOffice 2.0 lets you selectively enable Word macros and convert them to its native Star Basic. But then again Word macros are a primary source of Windows viruses and few users know how to apply document-specific macros anyway. The main use of macros is to automate common word processing tasks and both OpenOffice and MS Word let you do that. If anything Star Basic is much more versatile than Microsoft's legendary Visual Basic, has copious documentation and should make the transition from one office suite to another relatively painless.

What about Publisher?

There is one conspicuous omission in OpenOffice: a program that imports MS Publisher files. To be honest I don't understand the attraction of Publisher. With only the full MS suite at my disposal (sadly a common occurrence in Microsoft-only offices), I'd find its core product, Word, a much easier option for most desktop publishing and then simply import graphics designed in other applications, but in OpenOffice one can change backgrounds and reformat page layout for different paper sizes effortlessly and besides the best program I know for 4-fold birthday cards is Corel Printhouse. The main problem for OpenOffice users is opening and editing MS Publisher files sent by others. Alas Microsoft's wizards for exporting to HTML are far from perfect especially if you desire high-definition print quality. If used with Acrobat Distiller (another £60), MS Publisher files can be exported to PDF files, although most graphics tend to be converted to bitmaps boosting file size. My best advice would be to kindly ask a Publisher user to save their file as an Enhanced Metafile (.emf) and then OpenOffice Draw will import it page by page more or less intact and let you edit and save the resulting multi-page document as PDF and voilà. However, if you demand professional results from your publications, then I'd consider either Corel Draw or, for larger outputs and budgets, Adobe InDesign or PageMaker. The latter will even import Publisher Files, produced by amateurs as no professionals worth their salt would use such a graphically challenged application.

The PowerPoint Paradigm

The features offered by this application, ubiquitous in offices throughout the public and private sectors, say more about the nature of our superficial society than the state of information technology. Indeed the term has embedded itself into our everyday vocabulary to such an extent that for many it may mean an indispensable multipurpose programme (many use it for desktop publishing or drafting web pages) or a projector they may use to display the results on a large screen. In effect PowerPoint draws on the resources of other applications either integral to the operating system or the Office suite, to juggle multimedia and display it in a series of slides rather than pages. Besides adding gratuitous custom animations of text and images floating over the screen, little functionality is native to PowerPoint. In the process, it encourages the dumbing down of messages to the lowest common denominator with no more than seven bullet points recommended on each page, a virtual collection of soundbites. I see some uses for computer projectors in many teaching situations as a replacement for overhead projectors, blackboards and whiteboards, but they don't need Micro$oft PowerPoint to work!

OpenOffice Impress does almost everything PowerPoint can do, but lacks the wealth of templates supplied with MS Office. For this you'll need to buy Sun's StarOffice 8 or rely on a third-party vendor. Admittedly I could not work out how to achieve the typewriter effect, but then again you probably just need a downloadable macro to perform this trick. Unlike the market leader, OpenOffice Impress exports to Flash, the de facto web-optimised multimedia integration plug-in. Hopefully, at some stage Web browsers will offer native support for XML-based SVG (Scalable Vector Graphics) and SMIL (Synchronized Multimedia Integration Language). Indeed OpenOffice Draw 2.0 will also let you save any graphic as an XML-compliant SVG file. Microsoft may provide a plug-in to enable anyone with Internet Explorer to view PowerPoint presentations within their browser window, but file sizes are way too big. The other day I tried to view a short presentation with just 20 slides that occupied 13.8MB (once downloaded, it displayed fine in OpenOffice Impress). All we need is a user-friendly application to resize and export digital snaps and video clips to such a format with text captions, and PowerPoint could well prove a passing fad.

The impact of this gizmo has attracted the attention of numerous social commentators. Edward Tufte has even written a book, The Cognitive Style of PowerPoint.

Database Integration

Originally considered a relative weakness of the OpenOffice suite, version 2.0's offering beats Access any day, offering not only the native dBase format but allowing full compatibility with open-source MySQL and Microsoft Access via ODBC and JDBC drivers, allowing users to update databases on a remote server. You can also import address books from Mozilla Thunderbird, Netscape Messenger and even MS Outlook, within the main Writer interface. If you stick to Microsoft, you'd need their professional MS-SQL Server to do that!

Why do people still use Microsoft Office?

If OpenOffice is so good and alternatives such as Corel PerfectOffice are cheaper, why would anyone want to spend over £100 on MS Office? Microsoft's virtual monopoly on personal computer operating systems and its marketing and PR clout have enabled it to persuade politicians, IT managers and the general public not only that their product is indispensable, but that migrating to another would prove costly. Their strongest argument is that retraining staff would prove more expensive than an upgrade. But how were staff trained in the first place? Were they trained in word-processing, spreadsheets, desktop publishing, database management and networking or were they trained to use Microsoft products to perform these tasks? In this regard, we could rename the European Computer Driving Licence a Microsoft Product Familiarisation Course. In reality, most users will only notice that some features are accessed from different menus or icons, but it's easy to change default shortcuts to those used in MS Office. It took me a little while to discover that FontWorks (WordArt for Microsoft aficionados) can be accessed by clicking on the drawing icon within OpenOffice Writer and then clicking on the FontWorks icon.

Prior to Office 2000, many may have installed a friend's copy of MS Office with the registration key affixed to the CD case. Now all MS products require product activation limited to one machine. This may seriously deter piracy, but has also led to a significant decrease in the number of people upgrading in the real world. There are still far more Office 97 users in the UK than owners of Office 2000 or 2003. Can you seriously justify such an outlandish expense with no tangible benefits over free open-source software? And if you really want to pay, you can always purchase Sun's jazzed up Star Office 8.0 for under £80. If your local council implements a Microsoft-only policy, let them know they're wasting our money to enrich the obscenely wealthy. We should treat operating systems and essential productivity software as a public good in the same way as libraries and schools. Computing is simply too pervasive for us to let one multinational corporation enjoy a near-total monopoly.