Does anyone know a free Delphi 32 library to import/export CSV files, like .NET FileHelpers?

quasiceo 2013-11-12

up vote 4 down vote favorite

Does anyone know a free Delphi 32 library to import/export CSV and TXT files, like .NET FileHelpers?

If your answer tends to be "why doesn't this lazy guy write a simple parser?", consider this 5-minute read: http:///csv_trouble.asp

Thanks in advance.

asked Jan 31 '11 at 17:59

José Eduardo

Actually the task splits into two parts: first write a tokenizer that accounts for quotes and spaces (throw away regexes; most of the time life is easier without them), then use the tokenizer to parse the CSV line by line. Writing a correct and complete tokenizer took me an hour or so. – Eugene Mayevski 'EldoS Corp Jan 31 '11 at 18:11

And exporting is even easier. I wrote my own exporter recently with a goal of high performance. – David Heffernan Jan 31 '11 at 18:36

Why did you link to a page ranting about those obvious things? Actually looking into the problem takes less time than seeking random advice on the internet. – Free Consulting Feb 1 '11 at 2:38

Exporting is definitely easier than parsing. To anybody who thinks CSV parsing is trivial: send me your CSV parser, and I'll show you ten real-world CSV input files that will break it. – Warren P Feb 13 '11 at 3:55

@Eugene: I'm probably missing something here, but why the need to count spaces? – Simon Mar 22 '11 at 10:25

6 Answers

up vote 12 down vote accepted

I wrote a dataset (a TTable-like object) for the JEDI project called TJvCsvDataSet. It also contains a stream class that very quickly loads a file from disk and parses it line by line, using the correct escape rules required for CSV files, even for files that include carriage-return/line-feed codes embedded within a field.

You just drop it on your form and set the CsvFieldDef property like this:

CsvFieldDef=ABC:%,DEF:#,GHI:$,....

There are special codes for integer, floating-point, ISO date-time, and other field types. It even allows you to map a wide-string field to a UTF-8 field in a CSV file.

There is a design-time property editor that saves you from having to declare the CSV field defs using the syntax above; instead, you can just pick the column types visually.

If you don't set up a CSV field def, it simply maps whatever exists in the file to string-type fields.
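
As a rough illustration of the same setup done in code rather than on a form, here is a minimal sketch. It assumes the JVCL is installed and that the component lives in the JvCsvData unit with FileName and CsvFieldDef properties; the file name items.csv, its header row (NAME,QTY), and the reading of `$` as string and `%` as integer are assumptions for this example, not something stated above.

```pascal
program JvCsvDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils, DB, JvCsvData; // JvCsvData is assumed to be the JVCL unit holding TJvCsvDataSet

var
  Csv: TJvCsvDataSet;
begin
  Csv := TJvCsvDataSet.Create(nil);
  try
    Csv.FileName := 'items.csv';        // hypothetical file with a NAME,QTY header row
    Csv.CsvFieldDef := 'NAME:$,QTY:%';  // assumed codes: $ = string, % = integer
    Csv.Open;
    // From here on it is ordinary TDataSet navigation.
    while not Csv.Eof do
    begin
      Writeln(Csv.FieldByName('NAME').AsString, ' x ',
        Csv.FieldByName('QTY').AsInteger);
      Csv.Next;
    end;
  finally
    Csv.Free;
  end;
end.
```

Because the component is a TDataSet descendant, it can also be wired to data-aware controls through a TDataSource in the usual way.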

Jedi JVCL: http://jvcl./

JvCsvDataSet Docs:

http://help./unit.php?Id=3107

http://help./item.php?Id=174896

edited Feb 1 '11 at 0:51

answered Feb 1 '11 at 0:40

Warren P

@Warren P, is it capable of, or easy to extend to, an "N lines" layout? I mean the first line has one layout and the second another, like master-detail. – José Eduardo Feb 3 '11 at 20:19

The component is tolerant of missing fields and values. That is as far as it goes. You can have a CSV database, and you can keep adding fields to it, and it will load a file that was created before that field was added. It doesn't even care if you reorder the fields. It is pretty darn smart. But it's not magic, and it won't have completely different sets of data on each line, no. – Warren P Feb 13 '11 at 3:51

If you wanted to implement a different layout on each line, you would not want a TTable component. But perhaps you might find it easy to start by grabbing my stream class, and my CSV-splitter class, and then wrap those into whatever it is that you wanted to do, that isn't csv exactly, but is delimited text. – Warren P Feb 13 '11 at 3:56

@Warren P, neurons burning... thanks – José Eduardo Feb 14 '11 at 11:13

Nice component Warren. Certainly very fast and reliable. Just curious - how to optimize the memory usage? Just by defining the fields? Or is there an optimum value (way to calculate) for TextStreamBuffer? (D7) – Simon Mar 23 '11 at 0:26

up vote 7 down vote

It's pretty basic, but TStringList has Delimiter, DelimitedText, and QuoteChar properties, which address some of these issues.

Updated to add, per comments: Don't be tempted by the CommaText property, which has some surprising limitations for backwards compatibility with archaic versions of Delphi.
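
A minimal sketch of the properties mentioned above, splitting a single CSV line. The sample text is made up for illustration, and StrictDelimiter is an addition not mentioned in this answer: it exists only in Delphi 2006 and later, and keeps spaces in unquoted text from also acting as separators.

```pascal
program DelimitedTextDemo;
{$APPTYPE CONSOLE}
uses
  Classes;

var
  SL: TStringList;
  i: Integer;
begin
  SL := TStringList.Create;
  try
    SL.Delimiter := ',';
    SL.QuoteChar := '"';
    // Without this, whitespace in unquoted text also splits items.
    SL.StrictDelimiter := True;
    // One line only: DelimitedText does not read multi-line records from a file.
    SL.DelimitedText := 'alpha,"beta, gamma","say ""hi""",two words';
    for i := 0 to SL.Count - 1 do
      Writeln(i, ': ', SL[i]);
  finally
    SL.Free;
  end;
end.
```

This should yield four items: the quoted comma stays inside the second item and the doubled quotes collapse to single ones. It still works one line at a time, so line breaks embedded in quoted fields need a real parser.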

edited Jan 31 '11 at 19:24

answered Jan 31 '11 at 18:19

Craig Stuntz

-1 this gets nowhere near doing the job – David Heffernan Jan 31 '11 at 18:37

Craig, does TStringList solve all of the many mini quirks CSV has? (strings inside quotes, newlines inside quotes, missing values, etc.) – Leonardo Herrera Jan 31 '11 at 18:44

@David, that's just wrong. The "space" issue is true for CommaText, but not for DelimitedText, which is why I didn't suggest CommaText, even for CSV. @Leonardo, some, but not all. Whether it's suitable depends on your needs, but it has the particular advantage of being built in. – Craig Stuntz Jan 31 '11 at 18:56

@Craig I fixed a typo in your answer and thus also gave me an opportunity to remove my down-vote. – David Heffernan Jan 31 '11 at 19:07

@David it is certainly true that CommaText and DelimitedText have (surprisingly) different rules and I should probably note that in my answer. – Craig Stuntz Jan 31 '11 at 19:23

up vote 3 down vote

My framework has code for this in the CsiTextStreamsUnt.pas file (see http://www./framework_delphi.htm)

edited Feb 7 '11 at 7:29

answered Jan 31 '11 at 22:27

Misha

Does it handle carriage-return linefeeds embedded inside a csv row? – Warren P Feb 1 '11 at 0:40

I intentionally removed this functionality about a year ago because there is no standard way to handle records spanning multiple lines for both reading and writing (and really there is no point trying). Instead I introduced two overridable methods, one for reading and one for writing, that enable a developer to add any type of multi-line record handling that they require. – Misha Feb 1 '11 at 1:39

Agree, Misha, there's no one standard way, but there is one most common way that it is seen. That is what I call the "defacto" rule. Whatever Excel does. So, maybe you could enable 'excel flag'. – Warren P Feb 13 '11 at 3:53

Good point, although given that I have never had to use this in any real-world scenarios I put it under the YAGNI principle, and if I do need it, I will override my methods. I have at least 20,000 LOC in just implementing utility routines and classes that I have actually used in live applications over the last 10 years - it would be way too much effort to add things that I might not need ;-) – Misha Feb 13 '11 at 9:41

up vote 0 down vote

Following the logic of the VCL's TXMLTransform, I wrote a TCsvTransform class helper that converts a .csv structure to and from a TClientDataSet.
For more details about TCsvTransform, see http://didier.cabale./delphi.htm#uCsvTransform.
NB: I used the same field type symbols as Warren's TJvCsvDataSet.

edited Jan 12 at 19:46

answered Jan 12 at 13:35

Didier Cabalé

up vote 0 down vote

My functions

function ParseCSVString(const s: string; const delimiter: Char = ','; const enclosure: Char = '"'): TStrings;
var
    i, len: Integer;
    f: string;
    inQuoted: Boolean;
begin
    // The caller owns the returned list and must free it.
    Result := TStringList.Create;
    len := Length(s);
    if len = 0 then Exit;
    // Sample input: Test,Test;"Test;Test";"Test""Test";;
    f := '';
    inQuoted := False;
    i := 0;
    while i < len do
    begin
        Inc(i);
        if s[i] = enclosure then
        begin
            // A doubled enclosure inside a quoted field is one literal enclosure.
            if inQuoted and (i < len) and (s[i+1] = enclosure) then
            begin
                f := f + enclosure; // was a hard-coded '"'
                Inc(i);
            end
            else
                inQuoted := not inQuoted;
        end
        else if (s[i] = delimiter) and not inQuoted then
        begin
            // An unquoted delimiter ends the current field.
            Result.Add(f);
            f := '';
        end
        else
            f := f + s[i];
    end;
    // Add the last field (empty if the input ends with a delimiter).
    Result.Add(f);
end;

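function EscapeCSVString(const s: string; const delimiter: Char = ','; const enclosure: Char = '"'): string;
begin
    // Double any enclosure characters inside the value.
    Result := StringReplace(s, enclosure, enclosure + enclosure, [rfReplaceAll]);
    // Quote the field when it contains a delimiter, an enclosure, or a line break.
    if (Pos(delimiter, s) > 0) or (Pos(enclosure, s) > 0)
        or (Pos(#13, s) > 0) or (Pos(#10, s) > 0) then
        Result := enclosure + Result + enclosure;
end;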

answered Jan 19 at 15:24

Sergey Shuchkin

up vote 0 down vote

Here is one I wrote that reads CSV files. It handles carriage returns inside quotes as well.

unit CSV;

interface
uses
  SysUtils, Generics.Collections, IOUtils;

type
  TParseState = (psRowStart, psFieldStart, psUnquotedFieldData,
    psQuotedFieldData, psQFBranch, psEndOfQuotedField, psQFEndSearch,
    psEndOfLine, psEndOfFile);

  TCSVField = class
  strict private
    FText: String;
  public
    constructor Create;
    destructor Destroy; override;
    property Text: string read FText write FText;
    procedure Clear;
  end;

  TCSVFieldList = class(TObjectList<TCSVField>)
  public
    function AddField(const AText: string): TCSVField;
    procedure ClearFields;
  end;

  TCSVRow = class
  strict private
    FFields: TCSVFieldList;
  public
    constructor Create;
    destructor Destroy; override;
    property Fields: TCSVFieldList read FFields;
  end;

  TCSVParser = class
  strict private
    FRow: TCSVRow;
    FContent: String;
    FCIdx: Integer;
    FParseState: TParseState;
    FEOF: Boolean;
    procedure ParseRow;
  public
    function First: Boolean;
    function EOF: Boolean;
    function Next: Boolean;
    procedure OpenFile(AFileName: String);
    procedure OpenText(const AText: string);
    property Row: TCSVRow read FRow;
    constructor Create;
    destructor Destroy; override;
  end;

implementation



{implementation of TCSVField}

procedure TCSVField.Clear;
begin
  FText:= '';
end;

constructor TCSVField.Create;
begin
  inherited Create;
end;

destructor TCSVField.Destroy;
begin
  inherited Destroy;
end;

{implementation of TCSVRow}

constructor TCSVRow.Create;
begin
  inherited Create;
  FFields:= TCSVFieldList.Create;
end;

destructor TCSVRow.Destroy;
begin
  FreeAndNil(FFields);
  inherited Destroy;
end;

{implementation of TCSVParser}

constructor TCSVParser.Create;
begin
  inherited Create;
  FRow:= TCSVRow.Create;
  FCIdx:= 1;
  FParseState:= psEndOfFile;
end;

destructor TCSVParser.Destroy;
begin
  FreeAndNil(FRow);
  inherited Destroy;
end;



function TCSVParser.EOF: Boolean;
begin
  Result:= FEOF;
end;

function TCSVParser.First: Boolean;
begin
  FEOF:= False;
  FCIdx:= 1;
  FParseState:= psRowStart;
  Result:= Next;
end;

function TCSVParser.Next: Boolean;
begin
  if not EOF then
    ParseRow;
  Result:= not EOF;
end;

procedure TCSVParser.OpenFile(AFileName: String);
begin
  OpenText(TFile.ReadAllText(AFileName));
end;

procedure TCSVParser.OpenText(const AText: string);
begin
  FContent:= AText;
  FRow.Fields.Clear;
  First;
end;

procedure TCSVParser.ParseRow;
var
  FieldIdx: Integer;

  procedure AddField(const AText: string);
  begin
    if FieldIdx > FRow.Fields.Count-1 then
      FRow.Fields.AddField(AText)
    else
      FRow.Fields[FieldIdx].Text:= AText;

    Inc(FieldIdx);
  end;

var
  FieldText: string;
  Curr: Char;
  LastIdx: Integer;
begin
  if FParseState = psEndOfFile then
  begin
    FEOF:= True;
    FRow.Fields.ClearFields;
    Exit;
  end;

  if not (FParseState in [psRowStart]) then
    raise Exception.Create('ParseRow requires ParseState = psRowStart');

  FieldIdx:= 0;
  FRow.Fields.ClearFields;
  LastIdx:= Length(FContent);
  while True do
  begin
    case FParseState of
      psRowStart:
        begin
          if FCIdx > LastIdx then
          begin
            FEOF:= True;
            FParseState:= psEndOfFile;
          end
          else
          begin
            FParseState:= psFieldStart;
          end;
          Dec(FCIdx); // do not consume
        end;
      psFieldStart:
        begin
          FieldText:= '';
          // Guard the read: after a trailing comma FCIdx is past the end.
          if (FCIdx <= LastIdx) and (FContent[FCIdx] = '"') then
            FParseState:= psQuotedFieldData
          else
          begin
            FParseState:= psUnquotedFieldData;
            Dec(FCIdx); // do not consume
          end;
        end;
      psUnquotedFieldData:
        begin
          if FCIdx > LastIdx then
          begin
            AddField(FieldText);
            FParseState:= psEndOfFile;
          end
          else
          begin
            Curr:= FContent[FCIdx];
            case Curr of
              #13, #10:
                begin
                  AddField(FieldText);
                  FParseState:= psEndOfLine;
                end;
              ',':
                begin
                  AddField(FieldText);
                  FParseState:= psFieldStart;
                end;
            else
              FieldText:= FieldText + Curr;
            end;
          end;
        end;
      psQuotedFieldData:
        begin
          if FCIdx > LastIdx then
            raise Exception.Create('EOF in quoted Field.');

          Curr:= FContent[FCIdx];
          if Curr = '"' then
            FParseState:= psQFBranch
          else
            FieldText:= FieldText + Curr;
        end;
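      psQFBranch:
        begin
          // Guard the read: the closing quote may be the last character.
          if (FCIdx <= LastIdx) and (FContent[FCIdx] = '"') then
          begin
            // A doubled quote is one literal quote inside the field.
            FieldText:= FieldText + '"';
            FParseState:= psQuotedFieldData;
          end
          else
          begin
            AddField(FieldText);
            FParseState:= psEndOfQuotedField;
            Dec(FCIdx); // do not consume
          end;
        end;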
      psEndOfQuotedField:
        begin
          if FCIdx > LastIdx then
            FParseState:= psEndOfFile
          else
          begin
            Curr:= FContent[FCIdx];
            if CharInSet(Curr, [#13, #10]) then
              FParseState:= psEndOfLine
            else
            begin
              FParseState:= psQFEndSearch;
              Dec(FCIdx); // do not consume
            end;
          end;
        end;
      psQFEndSearch:
        begin
          if FCIdx > LastIdx then
            FParseState:= psEndOfFile
          else
          begin
            Curr:= FContent[FCIdx];
            if CharInSet(Curr, [#13, #10]) then
              FParseState:= psEndOfLine
            else if Curr = ',' then
              FParseState:= psFieldStart;

            // skips white space or other until end
          end;
        end;
      psEndOfLine:
        begin
          if FCIdx > LastIdx then
            FParseState:= psEndOfFile
          else
          begin
            Curr:= FContent[FCIdx];
            if not CharInSet(Curr, [#13, #10]) then
            begin
              FParseState:= psRowStart;
              Break; // exit loop, we are done with this row
            end;
          end;
        end;
      psEndOfFile:
        begin
          Break;
        end;
    end;
    Inc(FCIdx);
  end;
end;


{ TCSVFieldList }

function TCSVFieldList.AddField(const AText: string): TCSVField;
begin
  Result:= TCSVField.Create;
  Add(Result);
  Result.Text:= AText;
end;

procedure TCSVFieldList.ClearFields;
var
  F: TCSVField;
begin
  for F in Self do
    F.Clear;
end;

end.
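
For reference, a short usage sketch of the unit above, assuming it is saved as CSV.pas on the search path. The program name and sample text are made up for illustration. Since OpenText already calls First, which parses the first row, the loop checks EOF before calling Next.

```pascal
program CsvParserDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils, CSV; // CSV is the unit listed above

var
  Parser: TCSVParser;
  i: Integer;
begin
  Parser := TCSVParser.Create;
  try
    // Two rows; the second has a quoted comma and a quoted line break.
    Parser.OpenText('a,b'#13#10'"x, y","multi'#13#10'line"');
    while not Parser.EOF do
    begin
      for i := 0 to Parser.Row.Fields.Count - 1 do
        Write('[', Parser.Row.Fields[i].Text, ']');
      Writeln;
      Parser.Next;
    end;
  finally
    Parser.Free;
  end;
end.
```

One thing to watch: ClearFields empties the field texts but keeps the field objects, so after a long row Fields.Count stays at that length and a following shorter row shows trailing empty fields.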