A summary of online resources for writing C# software that reads data from files stored in Adobe Portable Document Format (PDF).
Step-by-step instructions and sample C# code are at the bottom of the page.
Firstly, what PDF is
The Adobe PDF Technology Center PDF Reference webpage links to the definitive PDF specification, including the PDF Reference and Related Documentation (over 15MB). Adobe publishes the full specification to "foster the creation of an ecosystem around the PDF format".
C# Resources for reading PDF files
A PDF Forms Parser by Michael Ganss addresses the problem of filling data into a PDF form programmatically (for example, with generated content or data read from a database). He writes, "The parser is not a full-fledged PDF parser but rather a small, one-class parser that can be dropped into any project where form field parsing is necessary instead of a whole library that adds a lot of overhead. Although the parser supports all types of PDF objects except for streams, it parses just the form fields of a PDF file by looking at the AcroForm dictionary." The page links to a 22.3KB source code download. (Oct 2004)
iText is a library that enables you to generate PDF files on the fly. The documentation says, "the iText classes are very useful for people who need to generate read-only, platform independent documents containing text, lists, tables and images; or who want to perform specific manipulations on existing PDF documents." It is written for use in Java systems but there is a .NET port available: iTextSharp (written in C#), implemented as an assembly and downloadable from this page on SourceForge (Nov 2007). See the iTextSharp code example at the bottom of the page.
Linda Fahmy explains how to create Arabic PDF files (May 2007)
iTextSharp Tutorial Codes (C#) gives 100 examples that teach you how to use iTextSharp (on RubyPDF blog)
PDFBox is a Java library (see the sub-bullet for how to use it in C# .NET) that lets you create new PDF documents, manipulate existing documents and extract content from documents. PDFBox also includes several command line utilities. Functionality includes:
PDF to text extraction
Merge PDF documents
PDF document encryption/decryption
Lucene search engine integration
Fill in form data (FDF and XFDF)
Create a PDF from a text file
Create images from PDF pages
Print a PDF
PDFBox can be downloaded from SourceForge.
Read from a PDF file using C# on Lucian's Weblog shows you how to use PDFBox with IKVM in a C# .NET project. IKVM is an implementation of Java for Mono and the Microsoft .NET Framework, and includes a Java Virtual Machine implemented in .NET, a .NET implementation of the Java class libraries, and tools that enable Java and .NET interoperability. First download IKVM and PDFBox. Then in Visual Studio .NET add two DLLs to your project: IKVM.GNU.Classpath.dll and PDFBox-0.7.3.dll. You then need to copy FontBox-0.1.0-dev.dll and IKVM.Runtime.dll into your project's bin directory. A good place to start is the simple three-line C# program to read text from a PDF file given on Lucian's Weblog. The comments on that page address many of the common problems and errors that users found (including errors about bcprov-jdk14-132.dll, the error message "The type initializer for 'java.io.File' threw an exception", TypeLoadException, and issues with tables, fonts, images, etc.).
Some Open Source PDF Libraries in C# here include iTextSharp, PDFsharp, Report.NET, SharpPDF, ASP.NET FO PDF, PDF Clown and PDFjet Open Source Edition.
Winnovative Software Solutions produce a number of utilities for sale:
Winnovative HTML to PDF Converter Library for .NET
Winnovative PDF Creator Library for .NET
Winnovative RTF to PDF Library for .NET
Winnovative PDF Merge Library for .NET
Winnovative PDF Split Library for .NET
Winnovative PDF Security Library for .NET
Winnovative PDF Viewers for ASP.NET and Windows Forms
Winnovative Chart Control for .NET
Winnovative PDF Tools for .NET
Winnovative Reporting Tools for .NET
Extracting images from PDF files using C#
Winnovative Software Solutions produce PDF Images Extractor for .NET, a .NET 2.0 library that lets you extract images from a PDF file in formats such as bmp, png, jpeg, etc. It includes samples of C# code. An evaluation version can be downloaded and the full product can be purchased from their website.
If you want to extract image files using a desktop utility instead of writing C# code, FileBuzz feature a shareware product called A-PDF Image Extractor v1.0.0 which can extract image files from a single PDF file or a batch of PDF files. It can save images in TIFF, JPEG, GIF, BMP, PNG, TGA, PCX, ICO, JP2 (JPEG 2000) and DCX format, and supports a variety of image filters used in PDF files including LZWDecode, FlateDecode, RunLengthDecode, CCITTFaxDecode (TIFF), JBIG2Decode (JBig2), DCTDecode (JPEG), and JPXDecode (JPEG 2000).
Writing data to a pdf file:
Chris Hornberger wrote on Jul 2 2003, 6:59 am:
"Create a Crystal report with the information you want on it, then simply export it to PDF. The fact
that you're using C#, I assume you're also using VS Studio.NET and hence, have Crystal too. This
will allow you to create your PDF file. Another choice is to spring for Adobe Pagemill and print
to the PDF file format."
The brief article "Microsoft Visual Studio.NET: Crystal Reports" by Mujtaba Khambatti explains the benefits of Crystal Reports, designing a report, and using Crystal Reports in projects you create.
There is comprehensive documentation on .NET Crystal Reports in the Microsoft Developer Network in the section MSDN > MSDN Library > Development Tools and Languages > Visual Studio .NET > Developing with Visual Studio .NET > Designing Distributed Applications > Crystal Reports
Other Assorted PDF Utilities:
Some free utilities are available for download. Using one of these instead of writing your own software may save you the trouble of re-inventing the wheel...
The RubyPDF Blog contains an assortment of free utilities for manipulating PDF files:
pdf cropper to remove white margins
BookmarkExtractor to extract all bookmarks from a PDF file
PDF2PPT converts a PDF to a PowerPoint file
Pdfrotate rotates every page by a chosen multiple of 90°
pdfselect extracts pages, splits a PDF or reverses it
PDF N-UP Maker builds an n-up PDF or booklet
WOA PDF-Excel is a free utility that lets you convert PDF files into Excel; it is customisable and lets you extract data from a folder of PDF files into a single spreadsheet (e.g. a folder of invoices). Produced by Wilkie Office Automation (2006)
Other useful links:
Some background reading...
A useful document How-To-Select a PDF
Component for .NET™ covers many topics,
including:
PDF Viewers
PDF Generators
XML-to-PDF Processors
PDF Print Drivers
PDF Editors
PDF Forms Products
A worked example – C# code to read a PDF document's properties:
Here are step-by-step instructions for using C# and Visual Studio to read the properties of a PDF file using iTextSharp:
(1) Download the most recent version of iTextSharp from http://sourceforge.net/projects/itextsharp/ and unzip the file
(2) Create a new project in Visual Studio (screenshots are from VS2005 and I created a console application). Now you need to add a reference to the iTextSharp dll: in the solution explorer, right click on the project name and select Add Reference…
(3) Click the Browse tab
(4) … and now navigate to the folder into which you unzipped the iTextSharp.dll file, click it, then click the OK button
(5) You will now see itextsharp listed in the solution explorer under References.
Now you can use any of the techniques illustrated in over 200 tutorial files that you also unzipped from the download. Here's a simple bit of code that reads in the properties of a PDF file that's on the web (chosen at random) at:
http://www.chinehamchat.com/Chineham_Chat_Advertisements.pdf
(note: the sample PDF file may have changed since I ran my program)
using System;
using System.Collections.Generic;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;

namespace PdfProperties
{
    class Program
    {
        static void Main(string[] args)
        {
            // create a reader (constructor overloaded for path to local file or URL)
            PdfReader reader = new PdfReader("http://www.chinehamchat.com/Chineham_Chat_Advertisements.pdf");

            // total number of pages
            int n = reader.NumberOfPages;

            // size of the first page
            Rectangle psize = reader.GetPageSize(1);
            float width = psize.Width;
            float height = psize.Height;
            Console.WriteLine("Size of page 1 of {0} => {1} × {2}", n, width, height);

            // file properties
            Dictionary<string, string> infodict = reader.Info;
            foreach (KeyValuePair<string, string> kvp in infodict)
                Console.WriteLine(kvp.Key + " => " + kvp.Value);
        }
    }
}
from which the output (eventually, as you need to give time for the PDF to download) is:
Size of page 1 of 24 => 421 × 595
ModDate => D:20120122082532Z
CreationDate => D:20101117141712Z
Title => Chineham Chat Advertisement Supplement
Creator => PScript5.dll Version 5.2.2
Author => Chineham Chat Magazine
Keywords => Chineham Chat, Magazine, Basingstoke, Advertisements
Subject => Adverts from the Chineham Chat magazine, distributed free to all households in Chineham, Basingstoke, Hampshire, UK
Producer => Acrobat Distiller 4.05 for Windows
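The ModDate and CreationDate values in that output use the PDF date format: a "D:" prefix, then year, month, day, hour, minute and second as digits, then an optional time zone ("Z" for UTC, or an offset such as +01'00'). As a minimal sketch, they could be turned into a .NET DateTimeOffset along the following lines. PdfDateSketch is a hypothetical helper written for illustration, not part of iTextSharp; it assumes the full 14-digit form is present and treats a missing time zone as UTC.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of iTextSharp): parses PDF date strings
// such as "D:20120122082532Z" into a DateTimeOffset.
static class PdfDateSketch
{
    public static DateTimeOffset Parse(string pdfDate)
    {
        if (pdfDate == null) throw new ArgumentNullException("pdfDate");

        // Drop the optional "D:" prefix.
        string s = pdfDate.StartsWith("D:", StringComparison.Ordinal)
            ? pdfDate.Substring(2)
            : pdfDate;

        // Assumption: all of YYYYMMDDHHmmSS is present.
        if (s.Length < 14)
            throw new FormatException("Expected at least YYYYMMDDHHmmSS: " + pdfDate);

        DateTime stamp = DateTime.ParseExact(
            s.Substring(0, 14), "yyyyMMddHHmmss", CultureInfo.InvariantCulture);

        // Whatever follows is "", "Z", or an offset such as +01'00'.
        string tz = s.Substring(14);
        TimeSpan offset = TimeSpan.Zero; // assumption: no suffix is treated as UTC
        if (tz.Length >= 3 && (tz[0] == '+' || tz[0] == '-'))
        {
            int hours = int.Parse(tz.Substring(1, 2), CultureInfo.InvariantCulture);
            int minutes = 0;
            // Minutes, when present, follow an apostrophe: +HH'mm'
            if (tz.Length >= 6 && tz[3] == '\'')
                minutes = int.Parse(tz.Substring(4, 2), CultureInfo.InvariantCulture);
            offset = new TimeSpan(hours, minutes, 0);
            if (tz[0] == '-') offset = offset.Negate();
        }

        return new DateTimeOffset(stamp, offset);
    }
}

class Program
{
    static void Main()
    {
        // The ModDate value from the sample output.
        DateTimeOffset modified = PdfDateSketch.Parse("D:20120122082532Z");
        Console.WriteLine(modified.ToString("yyyy-MM-dd HH:mm:ss zzz", CultureInfo.InvariantCulture));
    }
}
```

In the worked example, the same helper could be applied to kvp.Value whenever kvp.Key is "ModDate" or "CreationDate".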
While you're typing in the code, you'll notice when you type "reader." that IntelliSense gives you a long list of methods and properties, evidence of the breadth of functionality in this library.
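The page size printed by the example (421 × 595) is in default PDF user-space units, i.e. points, where one point is 1/72 inch. That works out to roughly 148.5 × 209.9 mm, which is about A5. A small sketch of the conversion follows; PageSizeSketch is a hypothetical helper for illustration, not part of iTextSharp, and it assumes the document does not redefine the user unit.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of iTextSharp): converts PDF points to millimetres.
static class PageSizeSketch
{
    const double PointsPerInch = 72.0;       // default PDF user-space unit
    const double MillimetresPerInch = 25.4;

    public static double PointsToMillimetres(double points)
    {
        return points / PointsPerInch * MillimetresPerInch;
    }
}

class Program
{
    static void Main()
    {
        // Width and height of page 1 from the sample output, in points.
        double widthMm = PageSizeSketch.PointsToMillimetres(421);
        double heightMm = PageSizeSketch.PointsToMillimetres(595);

        Console.WriteLine(string.Format(
            CultureInfo.InvariantCulture, "{0:F1} × {1:F1} mm", widthMm, heightMm));
    }
}
```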