csv-3.3.4/ 0000755 0000041 0000041 00000000000 15000146530 012336 5 ustar www-data www-data csv-3.3.4/NEWS.md 0000644 0000041 0000041 00000046576 15000146530 013456 0 ustar www-data www-data # News
## 3.3.4 - 2025-04-13
### Improvements
* `csv-filter`: Removed an experimental command line tool.
* GH-341
## 3.3.3 - 2025-03-20
### Improvements
* `csv-filter`: Added an experimental command line tool to filter a CSV.
* Patch by Burdette Lamar
### Fixes
* Fixed wrong EOF detection for `ARGF`
* GH-328
* Reported by Takeshi Nishimatsu
* Fixed a regression bug that `CSV.open` rejects integer mode.
* GH-336
* Reported by Dave Burgess
### Thanks
* Takeshi Nishimatsu
* Burdette Lamar
* Dave Burgess
## 3.3.2 - 2024-12-21
### Fixes
* Fixed a parse bug with a quoted line with `col_sep` and an empty
line. This was introduced in 3.3.1.
* GH-324
* Reported by stoodfarback
### Thanks
* stoodfarback
## 3.3.1 - 2024-12-15
### Improvements
* `CSV.open`: Changed to detect BOM by default. Note that this isn't
enabled on Windows because Ruby may have a bug. See also:
https://bugs.ruby-lang.org/issues/20526
* GH-301
* Reported by Junichi Ito
* Improved performance.
* GH-311
* GH-312
* Patch by Vladimir Kochnev
* `CSV.open`: Added support for `StringIO` as an input.
* GH-300
* GH-302
* Patch by Marcelo
* Added a built-in time converter. You can use it by `converters:
:time`.
* GH-313
* Patch by Bart de Water
* Added `CSV::TSV` for tab-separated values.
* GH-272
* GH-319
* Reported by kojix2
* Patch by Jas
### Thanks
* Junichi Ito
* Vladimir Kochnev
* Marcelo
* Bart de Water
* kojix2
* Jas
## 3.3.0 - 2024-03-22
### Fixes
* Fixed a regression parse bug in 3.2.9 that parsing with
`:skip_lines` may cause wrong result.
## 3.2.9 - 2024-03-22
### Fixes
* Fixed a parse bug that wrong result may be happen when:
* `:skip_lines` is used
* `:row_separator` is `"\r\n"`
* There is a line that includes `\n` as a column value
Reported by Ryo Tsukamoto.
GH-296
### Thanks
* Ryo Tsukamoto
## 3.2.8 - 2023-11-08
### Improvements
* Added `CSV::InvalidEncodingError`.
Patch by Kosuke Shibata.
GH-287
### Thanks
* Kosuke Shibata
## 3.2.7 - 2023-06-26
### Improvements
* Removed an unused internal variable.
[GH-273](https://github.com/ruby/csv/issues/273)
[Patch by Mau Magnaguagno]
* Changed to use `https://` instead of `http://` in documents.
[GH-274](https://github.com/ruby/csv/issues/274)
[Patch by Vivek Bharath Akupatni]
* Added prefix to a helper module in test.
[GH-278](https://github.com/ruby/csv/issues/278)
[Patch by Luke Gruber]
* Added a documentation for `liberal_parsing: {backslash_quotes: true}`.
[GH-280](https://github.com/ruby/csv/issues/280)
[Patch by Mark Schneider]
### Fixes
* Fixed a wrong execution result in documents.
[GH-276](https://github.com/ruby/csv/issues/276)
[Patch by Yuki Tsujimoto]
* Fixed a bug that the same line is used multiple times.
[GH-279](https://github.com/ruby/csv/issues/279)
[Reported by Gabriel Nagy]
### Thanks
* Mau Magnaguagno
* Vivek Bharath Akupatni
* Yuki Tsujimoto
* Luke Gruber
* Mark Schneider
* Gabriel Nagy
## 3.2.6 - 2022-12-08
### Improvements
* `CSV#read` consumes the same lines with other methods like
`CSV#shift`.
[[GitHub#258](https://github.com/ruby/csv/issues/258)]
[Reported by Lhoussaine Ghallou]
* All `Enumerable` based methods consume the same lines with other
methods. This may have a performance penalty.
[[GitHub#260](https://github.com/ruby/csv/issues/260)]
[Reported by Lhoussaine Ghallou]
* Simplify some implementations.
[[GitHub#262](https://github.com/ruby/csv/pull/262)]
[[GitHub#263](https://github.com/ruby/csv/pull/263)]
[Patch by Mau Magnaguagno]
### Fixes
* Fixed `CSV.generate_lines` document.
[[GitHub#257](https://github.com/ruby/csv/pull/257)]
[Patch by Sampat Badhe]
### Thanks
* Sampat Badhe
* Lhoussaine Ghallou
* Mau Magnaguagno
## 3.2.5 - 2022-08-26
### Improvements
* Added `CSV.generate_lines`.
[[GitHub#255](https://github.com/ruby/csv/issues/255)]
[Reported by OKURA Masafumi]
[[GitHub#256](https://github.com/ruby/csv/pull/256)]
[Patch by Eriko Sugiyama]
### Thanks
* OKURA Masafumi
* Eriko Sugiyama
## 3.2.4 - 2022-08-22
### Improvements
* Cleaned up internal implementations.
[[GitHub#249](https://github.com/ruby/csv/pull/249)]
[[GitHub#250](https://github.com/ruby/csv/pull/250)]
[[GitHub#251](https://github.com/ruby/csv/pull/251)]
[Patch by Mau Magnaguagno]
* Added support for RFC 3339 style time.
[[GitHub#248](https://github.com/ruby/csv/pull/248)]
[Patch by Thierry Lambert]
* Added support for transcoding String CSV. Syntax is
`from-encoding:to-encoding`.
[[GitHub#254](https://github.com/ruby/csv/issues/254)]
[Reported by Richard Stueven]
* Added quoted information to `CSV::FieldInfo`.
[[GitHub#254](https://github.com/ruby/csv/pull/253)]
[Reported by Hirokazu SUZUKI]
### Fixes
* Fixed a link in documents.
[[GitHub#244](https://github.com/ruby/csv/pull/244)]
[Patch by Peter Zhu]
### Thanks
* Peter Zhu
* Mau Magnaguagno
* Thierry Lambert
* Richard Stueven
* Hirokazu SUZUKI
## 3.2.3 - 2022-04-09
### Improvements
* Added contents summary to `CSV::Table#inspect`.
[GitHub#229][Patch by Eriko Sugiyama]
[GitHub#235][Patch by Sampat Badhe]
* Suppressed `$INPUT_RECORD_SEPARATOR` deprecation warning by
`Warning.warn`.
[GitHub#233][Reported by Jean byroot Boussier]
* Improved error message for liberal parsing with quoted values.
[GitHub#231][Patch by Nikolay Rys]
* Fixed typos in documentation.
[GitHub#236][Patch by Sampat Badhe]
* Added `:max_field_size` option and deprecated `:field_size_limit` option.
[GitHub#238][Reported by Dan Buettner]
* Added `:symbol_raw` to built-in header converters.
[GitHub#237][Reported by taki]
[GitHub#239][Patch by Eriko Sugiyama]
### Fixes
* Fixed a bug that some texts may be dropped unexpectedly.
[Bug #18245][ruby-core:105587][Reported by Hassan Abdul Rehman]
* Fixed a bug that `:field_size_limit` doesn't work with not complex row.
[GitHub#238][Reported by Dan Buettner]
### Thanks
* Hassan Abdul Rehman
* Eriko Sugiyama
* Jean byroot Boussier
* Nikolay Rys
* Sampat Badhe
* Dan Buettner
* taki
## 3.2.2 - 2021-12-24
### Improvements
* Added a validation for invalid option combination.
[GitHub#225][Patch by adamroyjones]
* Improved documentation for developers.
[GitHub#227][Patch by Eriko Sugiyama]
### Fixes
* Fixed a bug that all of `ARGF` contents may not be consumed.
[GitHub#228][Reported by Rafael Navaza]
### Thanks
* adamroyjones
* Eriko Sugiyama
* Rafael Navaza
## 3.2.1 - 2021-10-23
### Improvements
* doc: Fixed wrong class name.
[GitHub#217][Patch by Vince]
* Changed to always use `"\n"` for the default row separator on Ruby
3.0 or later because `$INPUT_RECORD_SEPARATOR` was deprecated
since Ruby 3.0.
* Added support for Ractor.
[GitHub#218][Patch by rm155]
* Users who want to use the built-in converters in non-main
Ractors need to call `Ractor.make_shareable(CSV::Converters)`
and/or `Ractor.make_shareable(CSV::HeaderConverters)` before
creating non-main Ractors.
### Thanks
* Vince
* Joakim Antman
* rm155
## 3.2.0 - 2021-06-06
### Improvements
* `CSV.open`: Added support for `:newline` option.
[GitHub#198][Patch by Nobuyoshi Nakada]
* `CSV::Table#each`: Added support for column mode with duplicated
headers.
[GitHub#206][Reported by Yaroslav Berezovskiy]
* `Object#CSV`: Added support for Ruby 3.0.
* `CSV::Row`: Added support for pattern matching.
[GitHub#207][Patch by Kevin Newton]
### Fixes
* Fixed typos in documentation.
[GitHub#196][GitHub#205][Patch by Sampat Badhe]
### Thanks
* Sampat Badhe
* Nobuyoshi Nakada
* Yaroslav Berezovskiy
* Kevin Newton
## 3.1.9 - 2020-11-23
### Fixes
* Fixed a compatibility bug that the line to be processed by
`skip_lines:` has a row separator.
[GitHub#194][Reported by Josef Šimánek]
### Thanks
* Josef Šimánek
## 3.1.8 - 2020-11-18
### Improvements
* Improved documentation.
[Patch by Burdette Lamar]
### Thanks
* Burdette Lamar
## 3.1.7 - 2020-08-04
### Improvements
* Improved document.
[GitHub#158][GitHub#160][GitHub#161]
[Patch by Burdette Lamar]
* Updated required Ruby version to 2.5.0 or later.
[GitHub#159]
[Patch by Gabriel Nagy]
* Removed stringio 0.1.3 or later dependency.
### Thanks
* Burdette Lamar
* Gabriel Nagy
## 3.1.6 - 2020-07-20
### Improvements
* Improved document.
[GitHub#127][GitHub#135][GitHub#136][GitHub#137][GitHub#139][GitHub#140]
[GitHub#141][GitHub#142][GitHub#143][GitHub#145][GitHub#146][GitHub#148]
[GitHub#148][GitHub#151][GitHub#152][GitHub#154][GitHub#155][GitHub#157]
[Patch by Burdette Lamar]
* `CSV.open`: Added support for `undef: :replace`.
[GitHub#129][Patch by Koichi ITO]
* `CSV.open`: Added support for `invalid: :replace`.
[GitHub#129][Patch by Koichi ITO]
* Don't run quotable check for invalid encoding field values.
[GitHub#131][Patch by Koichi ITO]
* Added support for specifying the target indexes and names to
`force_quotes:`.
[GitHub#153][Reported by Aleksandr]
* `CSV.generate`: Changed to use the encoding of the first non-ASCII
field rather than the encoding of ASCII only field.
* Changed to require the stringio gem 0.1.3 or later.
### Thanks
* Burdette Lamar
* Koichi ITO
* Aleksandr
## 3.1.5 - 2020-05-18
### Improvements
* Improved document.
[GitHub#124][Patch by Burdette Lamar]
### Fixes
* Added missing document files.
[GitHub#125][Reported by joast]
### Thanks
* Burdette Lamar
* joast
## 3.1.4 - 2020-05-17
### Improvements
* Improved document.
[GitHub#122][Patch by Burdette Lamar]
* Stopped to dropping stack trace for exception caused by
`CSV.parse_line`.
[GitHub#120][Reported by Kyle d'Oliveira]
### Fixes
* Fixed a bug that `:write_nil_value` or `:write_empty_value` don't
work with non `String` objects.
[GitHub#123][Reported by asm256]
### Thanks
* Burdette Lamar
* asm256
* Kyle d'Oliveira
## 3.1.3 - 2020-05-09
### Improvements
* `CSV::Row#dup`: Copied deeply.
[GitHub#108][Patch by Jim Kane]
### Fixes
* Fixed a infinite loop bug for zero length match `skip_lines`.
[GitHub#110][Patch by Mike MacDonald]
* `CSV.generate`: Fixed a bug that encoding isn't set correctly.
[GitHub#110][Patch by Seiei Miyagi]
* Fixed document for the `:strip` option.
[GitHub#114][Patch by TOMITA Masahiro]
* Fixed a parse bug when split charcter exists in middle of column
value.
[GitHub#115][Reported by TOMITA Masahiro]
### Thanks
* Jim Kane
* Mike MacDonald
* Seiei Miyagi
* TOMITA Masahiro
## 3.1.2 - 2019-10-12
### Improvements
* Added `:col_sep` check.
[GitHub#94][Reported by Florent Beaurain]
* Suppressed warnings.
[GitHub#96][Patch by Nobuyoshi Nakada]
* Improved documentation.
[GitHub#101][GitHub#102][Patch by Vitor Oliveira]
### Fixes
* Fixed a typo in documentation.
[GitHub#95][Patch by Yuji Yaginuma]
* Fixed a multibyte character handling bug.
[GitHub#97][Patch by koshigoe]
* Fixed typos in documentation.
[GitHub#100][Patch by Vitor Oliveira]
* Fixed a bug that seeked `StringIO` isn't accepted.
[GitHub#98][Patch by MATSUMOTO Katsuyoshi]
* Fixed a bug that `CSV.generate_line` doesn't work with
`Encoding.default_internal`.
[GitHub#105][Reported by David Rodríguez]
### Thanks
* Florent Beaurain
* Yuji Yaginuma
* Nobuyoshi Nakada
* koshigoe
* Vitor Oliveira
* MATSUMOTO Katsuyoshi
* David Rodríguez
## 3.1.1 - 2019-04-26
### Improvements
* Added documentation for `strip` option.
[GitHub#88][Patch by hayashiyoshino]
* Added documentation for `write_converters`, `write_nil_value` and
`write_empty_value` options.
[GitHub#87][Patch by Masafumi Koba]
* Added documentation for `quote_empty` option.
[GitHub#89][Patch by kawa\_tech]
### Fixes
* Fixed a bug that `strip; true` removes a newline.
### Thanks
* hayashiyoshino
* Masafumi Koba
* kawa\_tech
## 3.1.0 - 2019-04-17
### Fixes
* Fixed a backward incompatibility bug that `CSV#eof?` may raises an
error.
[GitHub#86][Reported by krororo]
### Thanks
* krororo
## 3.0.9 - 2019-04-15
### Fixes
* Fixed a test for Windows.
## 3.0.8 - 2019-04-11
### Fixes
* Fixed a bug that `strip: String` doesn't work.
## 3.0.7 - 2019-04-08
### Improvements
* Improve parse performance 1.5x by introducing loose parser.
### Fixes
* Fix performance regression in 3.0.5.
* Fix a bug that `CSV#line` returns wrong value when you
use `quote_char: nil`.
## 3.0.6 - 2019-03-30
### Improvements
* `CSV.foreach`: Added support for `mode`.
## 3.0.5 - 2019-03-24
### Improvements
* Added `:liberal_parsing => {backslash_quote: true}` option.
[GitHub#74][Patch by 284km]
* Added `:write_converters` option.
[GitHub#73][Patch by Danillo Souza]
* Added `:write_nil_value` option.
* Added `:write_empty_value` option.
* Improved invalid byte line number detection.
[GitHub#78][Patch by Alyssa Ross]
* Added `quote_char: nil` optimization.
[GitHub#79][Patch by 284km]
* Improved error message.
[GitHub#81][Patch by Andrés Torres]
* Improved IO-like implementation for `StringIO` data.
[GitHub#80][Patch by Genadi Samokovarov]
* Added `:strip` option.
[GitHub#58]
### Fixes
* Fixed a compatibility bug that `CSV#each` doesn't care `CSV#shift`.
[GitHub#76][Patch by Alyssa Ross]
* Fixed a compatibility bug that `CSV#eof?` doesn't care `CSV#each`
and `CSV#shift`.
[GitHub#77][Reported by Chi Leung]
* Fixed a compatibility bug that invalid line isn't ignored.
[GitHub#82][Reported by krororo]
* Fixed a bug that `:skip_lines` doesn't work with multibyte characters data.
[GitHub#83][Reported by ff2248]
### Thanks
* Alyssa Ross
* 284km
* Chi Leung
* Danillo Souza
* Andrés Torres
* Genadi Samokovarov
* krororo
* ff2248
## 3.0.4 - 2019-01-25
### Improvements
* Removed duplicated `CSV::Row#include?` implementations.
[GitHub#69][Patch by Max Schwenk]
* Removed duplicated `CSV::Row#header?` implementations.
[GitHub#70][Patch by Max Schwenk]
### Fixes
* Fixed a typo in document.
[GitHub#72][Patch by Artur Beljajev]
* Fixed a compatibility bug when row headers are changed.
[GitHub#71][Reported by tomoyuki kosaka]
### Thanks
* Max Schwenk
* Artur Beljajev
* tomoyuki kosaka
## 3.0.3 - 2019-01-12
### Improvements
* Migrated benchmark tool to benchmark-driver from benchmark-ips.
[GitHub#57][Patch by 284km]
* Added `liberal_parsing: {double_quote_outside_quote: true}` parse
option.
[GitHub#66][Reported by Watson]
* Added `quote_empty:` write option.
[GitHub#35][Reported by Dave Myron]
### Fixes
* Fixed a compatibility bug that `CSV.generate` always return
`ASCII-8BIT` encoding string.
[GitHub#63][Patch by Watson]
* Fixed a compatibility bug that `CSV.parse("", headers: true)`
doesn't return `CSV::Table`.
[GitHub#64][Reported by Watson][Patch by 284km]
* Fixed a compatibility bug that multiple-characters column
separator doesn't work.
[GitHub#67][Reported by Jesse Reiss]
* Fixed a compatibility bug that double `#each` parse twice.
[GitHub#68][Reported by Max Schwenk]
### Thanks
* Watson
* 284km
* Jesse Reiss
* Dave Myron
* Max Schwenk
## 3.0.2 - 2018-12-23
### Improvements
* Changed to use strscan in parser.
[GitHub#52][Patch by 284km]
* Improves CSV write performance.
3.0.2 will be about 2 times faster than 3.0.1.
* Improves CSV parse performance for complex case.
3.0.2 will be about 2 times faster than 3.0.1.
### Fixes
* Fixed a parse error bug for new line only input with `headers` option.
[GitHub#53][Reported by Chris Beer]
* Fixed some typos in document.
[GitHub#54][Patch by Victor Shepelev]
### Thanks
* 284km
* Chris Beer
* Victor Shepelev
## 3.0.1 - 2018-12-07
### Improvements
* Added a test.
[GitHub#38][Patch by 284km]
* `CSV::Row#dup`: Changed to duplicate internal data.
[GitHub#39][Reported by André Guimarães Sakata]
* Documented `:nil_value` and `:empty_value` options.
[GitHub#41][Patch by OwlWorks]
* Added support for separator detection for non-seekable inputs.
[GitHub#45][Patch by Ilmari Karonen]
* Removed needless code.
[GitHub#48][Patch by Espartaco Palma]
* Added support for parsing header only CSV with `headers: true`.
[GitHub#47][Patch by Kazuma Shibasaka]
* Added support for coverage report in CI.
[GitHub#48][Patch by Espartaco Palma]
* Improved auto CR row separator detection.
[GitHub#51][Reported by Yuki Kurihara]
### Fixes
* Fixed a typo in document.
[GitHub#40][Patch by Marcus Stollsteimer]
### Thanks
* 284km
* André Guimarães Sakata
* Marcus Stollsteimer
* OwlWorks
* Ilmari Karonen
* Espartaco Palma
* Kazuma Shibasaka
* Yuki Kurihara
## 3.0.0 - 2018-06-06
### Fixes
* Fixed a bug that header isn't returned for empty row.
[GitHub#37][Patch by Grace Lee]
### Thanks
* Grace Lee
## 1.0.2 - 2018-05-03
### Improvements
* Split file for CSV::VERSION
* Code cleanup: Split csv.rb into a more manageable structure
[GitHub#19][Patch by Espartaco Palma]
[GitHub#20][Patch by Steven Daniels]
* Use CSV::MalformedCSVError for invalid encoding line
[GitHub#26][Reported by deepj]
* Support implicit Row <-> Array conversion
[Bug #10013][ruby-core:63582][Reported by Dawid Janczak]
* Update class docs
[GitHub#32][Patch by zverok]
* Add `Row#each_pair`
[GitHub#33][Patch by zverok]
* Improve CSV performance
[GitHub#30][Patch by Watson]
* Add :nil_value and :empty_value option
### Fixes
* Fix a bug that "bom|utf-8" doesn't work
[GitHub#23][Reported by Pavel Lobashov]
* `CSV::Row#to_h`, `#to_hash`: uses the same value as `Row#[]`
[Bug #14482][Reported by tomoya ishida]
* Make row separator detection more robust
[GitHub#25][Reported by deepj]
* Fix a bug that too much separator when col_sep is `" "`
[Bug #8784][ruby-core:63582][Reported by Sylvain Laperche]
### Thanks
* Espartaco Palma
* Steven Daniels
* deepj
* Dawid Janczak
* zverok
* Watson
* Pavel Lobashov
* tomoya ishida
* Sylvain Laperche
* Ryunosuke Sato
## 1.0.1 - 2018-02-09
### Improvements
* `CSV::Table#delete`: Added bulk delete support. You can delete
multiple rows and columns at once.
[GitHub#4][Patch by Vladislav]
* Updated Gem description.
[GitHub#11][Patch by Marcus Stollsteimer]
* Code cleanup.
[GitHub#12][Patch by Marcus Stollsteimer]
[GitHub#14][Patch by Steven Daniels]
[GitHub#18][Patch by takkanm]
* `CSV::Table#dig`: Added.
[GitHub#15][Patch by Tomohiro Ogoke]
* `CSV::Row#dig`: Added.
[GitHub#15][Patch by Tomohiro Ogoke]
* Added ISO 8601 support to date time converter.
[GitHub#16]
### Fixes
* Fixed wrong `CSV::VERSION`.
[GitHub#10][Reported by Marcus Stollsteimer]
* `CSV.generate`: Fixed a regression bug that `String` argument is
ignored.
[GitHub#13][Patch by pavel]
### Thanks
* Vladislav
* Marcus Stollsteimer
* Steven Daniels
* takkanm
* Tomohiro Ogoke
* pavel
csv-3.3.4/csv.gemspec 0000644 0000041 0000041 00000006120 15000146530 014475 0 ustar www-data www-data #########################################################
# This file has been automatically generated by gem2tgz #
#########################################################
# -*- encoding: utf-8 -*-
# stub: csv 3.3.4 ruby lib
Gem::Specification.new do |s|
s.name = "csv".freeze
s.version = "3.3.4"
s.required_rubygems_version = Gem::Requirement.new(">= 0".freeze) if s.respond_to? :required_rubygems_version=
s.metadata = { "changelog_uri" => "https://github.com/ruby/csv/releases/tag/v3.3.4" } if s.respond_to? :metadata=
s.require_paths = ["lib".freeze]
s.authors = ["James Edward Gray II".freeze, "Kouhei Sutou".freeze]
s.date = "2025-04-13"
s.description = "The CSV library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.".freeze
s.email = [nil, "kou@cozmixng.org".freeze]
s.extra_rdoc_files = ["LICENSE.txt".freeze, "NEWS.md".freeze, "README.md".freeze, "doc/csv/recipes/filtering.rdoc".freeze, "doc/csv/recipes/generating.rdoc".freeze, "doc/csv/recipes/parsing.rdoc".freeze, "doc/csv/recipes/recipes.rdoc".freeze]
s.files = ["LICENSE.txt".freeze, "NEWS.md".freeze, "README.md".freeze, "doc/csv/arguments/io.rdoc".freeze, "doc/csv/options/common/col_sep.rdoc".freeze, "doc/csv/options/common/quote_char.rdoc".freeze, "doc/csv/options/common/row_sep.rdoc".freeze, "doc/csv/options/generating/force_quotes.rdoc".freeze, "doc/csv/options/generating/quote_empty.rdoc".freeze, "doc/csv/options/generating/write_converters.rdoc".freeze, "doc/csv/options/generating/write_empty_value.rdoc".freeze, "doc/csv/options/generating/write_headers.rdoc".freeze, "doc/csv/options/generating/write_nil_value.rdoc".freeze, "doc/csv/options/parsing/converters.rdoc".freeze, "doc/csv/options/parsing/empty_value.rdoc".freeze, "doc/csv/options/parsing/field_size_limit.rdoc".freeze, "doc/csv/options/parsing/header_converters.rdoc".freeze, "doc/csv/options/parsing/headers.rdoc".freeze, "doc/csv/options/parsing/liberal_parsing.rdoc".freeze, "doc/csv/options/parsing/nil_value.rdoc".freeze, "doc/csv/options/parsing/return_headers.rdoc".freeze, "doc/csv/options/parsing/skip_blanks.rdoc".freeze, "doc/csv/options/parsing/skip_lines.rdoc".freeze, "doc/csv/options/parsing/strip.rdoc".freeze, "doc/csv/options/parsing/unconverted_fields.rdoc".freeze, "doc/csv/recipes/filtering.rdoc".freeze, "doc/csv/recipes/generating.rdoc".freeze, "doc/csv/recipes/parsing.rdoc".freeze, "doc/csv/recipes/recipes.rdoc".freeze, "lib/csv.rb".freeze, "lib/csv/core_ext/array.rb".freeze, "lib/csv/core_ext/string.rb".freeze, "lib/csv/fields_converter.rb".freeze, "lib/csv/input_record_separator.rb".freeze, "lib/csv/parser.rb".freeze, "lib/csv/row.rb".freeze, "lib/csv/table.rb".freeze, "lib/csv/version.rb".freeze, "lib/csv/writer.rb".freeze]
s.homepage = "https://github.com/ruby/csv".freeze
s.licenses = ["Ruby".freeze, "BSD-2-Clause".freeze]
s.rdoc_options = ["--main".freeze, "README.md".freeze]
s.required_ruby_version = Gem::Requirement.new(">= 2.5.0".freeze)
s.rubygems_version = "3.3.15".freeze
s.summary = "CSV Reading and Writing".freeze
end
csv-3.3.4/doc/ 0000755 0000041 0000041 00000000000 15000146530 013103 5 ustar www-data www-data csv-3.3.4/doc/csv/ 0000755 0000041 0000041 00000000000 15000146530 013676 5 ustar www-data www-data csv-3.3.4/doc/csv/options/ 0000755 0000041 0000041 00000000000 15000146530 015371 5 ustar www-data www-data csv-3.3.4/doc/csv/options/common/ 0000755 0000041 0000041 00000000000 15000146530 016661 5 ustar www-data www-data csv-3.3.4/doc/csv/options/common/quote_char.rdoc 0000644 0000041 0000041 00000002555 15000146530 021673 0 ustar www-data www-data ====== Option +quote_char+
Specifies the character (\String of length 1) used used to quote fields
in both parsing and generating.
This String will be transcoded into the data's \Encoding before use.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_char) # => "\"" (double quote)
This is useful for an application that incorrectly uses ' (single-quote)
to quote fields, instead of the correct " (double-quote).
Using the default (double quote):
str = CSV.generate do |csv|
csv << ['foo', 0]
csv << ["'bar'", 1]
csv << ['"baz"', 2]
end
str # => "foo,0\n'bar',1\n\"\"\"baz\"\"\",2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["'bar'", "1"], ["\"baz\"", "2"]]
Using ' (single-quote):
quote_char = "'"
str = CSV.generate(quote_char: quote_char) do |csv|
csv << ['foo', 0]
csv << ["'bar'", 1]
csv << ['"baz"', 2]
end
str # => "foo,0\n'''bar''',1\n\"baz\",2\n"
ary = CSV.parse(str, quote_char: quote_char)
ary # => [["foo", "0"], ["'bar'", "1"], ["\"baz\"", "2"]]
---
Raises an exception if the \String length is greater than 1:
# Raises ArgumentError (:quote_char has to be nil or a single character String)
CSV.new('', quote_char: 'xx')
Raises an exception if the value is not a \String:
# Raises ArgumentError (:quote_char has to be nil or a single character String)
CSV.new('', quote_char: :foo)
csv-3.3.4/doc/csv/options/common/row_sep.rdoc 0000644 0000041 0000041 00000005344 15000146530 021216 0 ustar www-data www-data ====== Option +row_sep+
Specifies the row separator, a \String or the \Symbol :auto (see below),
to be used for both parsing and generating.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:row_sep) # => :auto
---
When +row_sep+ is a \String, that \String becomes the row separator.
The String will be transcoded into the data's Encoding before use.
Using "\n":
row_sep = "\n"
str = CSV.generate(row_sep: row_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using | (pipe):
row_sep = '|'
str = CSV.generate(row_sep: row_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0|bar,1|baz,2|"
ary = CSV.parse(str, row_sep: row_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using -- (two hyphens):
row_sep = '--'
str = CSV.generate(row_sep: row_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0--bar,1--baz,2--"
ary = CSV.parse(str, row_sep: row_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using '' (empty string):
row_sep = ''
str = CSV.generate(row_sep: row_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0bar,1baz,2"
ary = CSV.parse(str, row_sep: row_sep)
ary # => [["foo", "0bar", "1baz", "2"]]
---
When +row_sep+ is the \Symbol +:auto+ (the default),
generating uses "\n" as the row separator:
str = CSV.generate do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
Parsing, on the other hand, invokes auto-discovery of the row separator.
Auto-discovery reads ahead in the data looking for the next \r\n, +\n+, or +\r+ sequence.
The sequence will be selected even if it occurs in a quoted field,
assuming that you would have the same line endings there.
Example:
str = CSV.generate do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
The default $INPUT_RECORD_SEPARATOR ($/) is used
if any of the following is true:
* None of those sequences is found.
* Data is +ARGF+, +STDIN+, +STDOUT+, or +STDERR+.
* The stream is only available for output.
Obviously, discovery takes a little time. Set manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used as the line-ending translation can cause problems with resetting the document position to where it was before the read ahead.
csv-3.3.4/doc/csv/options/common/col_sep.rdoc 0000644 0000041 0000041 00000002665 15000146530 021167 0 ustar www-data www-data ====== Option +col_sep+
Specifies the \String column separator to be used
for both parsing and generating.
The \String will be transcoded into the data's \Encoding before use.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:col_sep) # => "," (comma)
Using the default (comma):
str = CSV.generate do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(str)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using +:+ (colon):
col_sep = ':'
str = CSV.generate(col_sep: col_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo:0\nbar:1\nbaz:2\n"
ary = CSV.parse(str, col_sep: col_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using +::+ (two colons):
col_sep = '::'
str = CSV.generate(col_sep: col_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo::0\nbar::1\nbaz::2\n"
ary = CSV.parse(str, col_sep: col_sep)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using '' (empty string):
col_sep = ''
str = CSV.generate(col_sep: col_sep) do |csv|
csv << [:foo, 0]
csv << [:bar, 1]
csv << [:baz, 2]
end
str # => "foo0\nbar1\nbaz2\n"
---
Raises an exception if parsing with the empty \String:
col_sep = ''
# Raises ArgumentError (:col_sep must be 1 or more characters: "")
CSV.parse("foo0\nbar1\nbaz2\n", col_sep: col_sep)
csv-3.3.4/doc/csv/options/generating/ 0000755 0000041 0000041 00000000000 15000146530 017514 5 ustar www-data www-data csv-3.3.4/doc/csv/options/generating/force_quotes.rdoc 0000644 0000041 0000041 00000000652 15000146530 023066 0 ustar www-data www-data ====== Option +force_quotes+
Specifies the boolean that determines whether each output field is to be double-quoted.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:force_quotes) # => false
For examples in this section:
ary = ['foo', 0, nil]
Using the default, +false+:
str = CSV.generate_line(ary)
str # => "foo,0,\n"
Using +true+:
str = CSV.generate_line(ary, force_quotes: true)
str # => "\"foo\",\"0\",\"\"\n"
csv-3.3.4/doc/csv/options/generating/quote_empty.rdoc 0000644 0000041 0000041 00000000536 15000146530 022744 0 ustar www-data www-data ====== Option +quote_empty+
Specifies the boolean that determines whether an empty value is to be double-quoted.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_empty) # => true
With the default +true+:
CSV.generate_line(['"', ""]) # => "\"\"\"\",\"\"\n"
With +false+:
CSV.generate_line(['"', ""], quote_empty: false) # => "\"\"\"\",\n"
csv-3.3.4/doc/csv/options/generating/write_empty_value.rdoc 0000644 0000041 0000041 00000000624 15000146530 024133 0 ustar www-data www-data ====== Option +write_empty_value+
Specifies the object that is to be substituted for each field
that has an empty \String.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_empty_value) # => ""
Without the option:
str = CSV.generate_line(['a', '', 'c', ''])
str # => "a,\"\",c,\"\"\n"
With the option:
str = CSV.generate_line(['a', '', 'c', ''], write_empty_value: "x")
str # => "a,x,c,x\n"
csv-3.3.4/doc/csv/options/generating/write_headers.rdoc 0000644 0000041 0000041 00000001251 15000146530 023211 0 ustar www-data www-data ====== Option +write_headers+
Specifies the boolean that determines whether a header row is included in the output;
ignored if there are no headers.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_headers) # => nil
Without +write_headers+:
file_path = 't.csv'
CSV.open(file_path,'w',
:headers => ['Name','Value']
) do |csv|
csv << ['foo', '0']
end
CSV.open(file_path) do |csv|
csv.shift
end # => ["foo", "0"]
With +write_headers+":
CSV.open(file_path,'w',
:write_headers => true,
:headers => ['Name','Value']
) do |csv|
csv << ['foo', '0']
end
CSV.open(file_path) do |csv|
csv.shift
end # => ["Name", "Value"]
csv-3.3.4/doc/csv/options/generating/write_nil_value.rdoc 0000644 0000041 0000041 00000000576 15000146530 023565 0 ustar www-data www-data ====== Option +write_nil_value+
Specifies the object that is to be substituted for each +nil+-valued field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_nil_value) # => nil
Without the option:
str = CSV.generate_line(['a', nil, 'c', nil])
str # => "a,,c,\n"
With the option:
str = CSV.generate_line(['a', nil, 'c', nil], write_nil_value: "x")
str # => "a,x,c,x\n"
csv-3.3.4/doc/csv/options/generating/write_converters.rdoc 0000644 0000041 0000041 00000001571 15000146530 023775 0 ustar www-data www-data ====== Option +write_converters+
Specifies converters to be used in generating fields.
See {Write Converters}[#class-CSV-label-Write+Converters]
Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_converters) # => nil
With no write converter:
str = CSV.generate_line(["\na\n", "\tb\t", " c "])
str # => "\"\na\n\",\tb\t, c \n"
With a write converter:
strip_converter = proc {|field| field.strip }
str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter)
str # => "a,b,c\n"
With two write converters (called in order):
upcase_converter = proc {|field| field.upcase }
downcase_converter = proc {|field| field.downcase }
write_converters = [upcase_converter, downcase_converter]
str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters)
str # => "a,b,c\n"
See also {Write Converters}[#class-CSV-label-Write+Converters]
csv-3.3.4/doc/csv/options/parsing/ 0000755 0000041 0000041 00000000000 15000146530 017034 5 ustar www-data www-data csv-3.3.4/doc/csv/options/parsing/headers.rdoc 0000644 0000041 0000041 00000002700 15000146530 021317 0 ustar www-data www-data ====== Option +headers+
Specifies a boolean, \Symbol, \Array, or \String to be used
to define column headers.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:headers) # => false
---
Without +headers+:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str)
csv # => #
csv.headers # => nil
csv.shift # => ["Name", "Count"]
---
If set to +true+ or the \Symbol +:first_row+,
the first row of the data is treated as a row of headers:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: true)
csv # => #
csv.headers # => ["Name", "Count"]
csv.shift # => #
---
If set to an \Array, the \Array elements are treated as headers:
str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: ['Name', 'Count'])
csv
csv.headers # => ["Name", "Count"]
csv.shift # => #
---
If set to a \String +str+, method CSV::parse_line(str, options) is called
with the current +options+, and the returned \Array is treated as headers:
str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: 'Name,Count')
csv
csv.headers # => ["Name", "Count"]
csv.shift # => #
csv-3.3.4/doc/csv/options/parsing/field_size_limit.rdoc 0000644 0000041 0000041 00000002247 15000146530 023225 0 ustar www-data www-data ====== Option +field_size_limit+
Specifies the \Integer field size limit.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:field_size_limit) # => nil
This is a maximum size CSV will read ahead looking for the closing quote for a field.
(In truth, it reads to the first line ending beyond this size.)
If a quote cannot be found within the limit CSV will raise a MalformedCSVError,
assuming the data is faulty.
You can use this limit to prevent what are effectively DoS attacks on the parser.
However, this limit can cause a legitimate parse to fail;
therefore the default value is +nil+ (no limit).
For the examples in this section:
str = <<~EOT
"a","b"
"
2345
",""
EOT
str # => "\"a\",\"b\"\n\"\n2345\n\",\"\"\n"
Using the default +nil+:
ary = CSV.parse(str)
ary # => [["a", "b"], ["\n2345\n", ""]]
Using 50:
field_size_limit = 50
ary = CSV.parse(str, field_size_limit: field_size_limit)
ary # => [["a", "b"], ["\n2345\n", ""]]
---
Raises an exception if a field is too long:
big_str = "123456789\n" * 1024
# Raises CSV::MalformedCSVError (Field size exceeded in line 1.)
CSV.parse('valid,fields,"' + big_str + '"', field_size_limit: 2048)
csv-3.3.4/doc/csv/options/parsing/converters.rdoc 0000644 0000041 0000041 00000002730 15000146530 022101 0 ustar www-data www-data ====== Option +converters+
Specifies converters to be used in parsing fields.
See {Field Converters}[#class-CSV-label-Field+Converters]
Default value:
CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil
The value may be a field converter name
(see {Stored Converters}[#class-CSV-label-Stored+Converters]):
str = '1,2,3'
# Without a converter
array = CSV.parse_line(str)
array # => ["1", "2", "3"]
# With built-in converter :integer
array = CSV.parse_line(str, converters: :integer)
array # => [1, 2, 3]
The value may be a converter list
(see {Converter Lists}[#class-CSV-label-Converter+Lists]):
str = '1,3.14159'
# Without converters
array = CSV.parse_line(str)
array # => ["1", "3.14159"]
# With built-in converters
array = CSV.parse_line(str, converters: [:integer, :float])
array # => [1, 3.14159]
The value may be a \Proc custom converter:
(see {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters]):
str = ' foo , bar , baz '
# Without a converter
array = CSV.parse_line(str)
array # => [" foo ", " bar ", " baz "]
# With a custom converter
array = CSV.parse_line(str, converters: proc {|field| field.strip })
array # => ["foo", "bar", "baz"]
See also {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters]
---
Raises an exception if the converter is not a converter name or a \Proc:
str = 'foo,0'
# Raises NoMethodError (undefined method `arity' for nil:NilClass)
CSV.parse(str, converters: :foo)
csv-3.3.4/doc/csv/options/parsing/unconverted_fields.rdoc 0000644 0000041 0000041 00000001720 15000146530 023567 0 ustar www-data www-data ====== Option +unconverted_fields+
Specifies the boolean that determines whether unconverted field values are to be available.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:unconverted_fields) # => nil
The unconverted field values are those found in the source data,
prior to any conversions performed via option +converters+.
When option +unconverted_fields+ is +true+,
each returned row (\Array or \CSV::Row) has an added method,
+unconverted_fields+, that returns the unconverted field values:
str = <<-EOT
foo,0
bar,1
baz,2
EOT
# Without unconverted_fields
csv = CSV.parse(str, converters: :integer)
csv # => [["foo", 0], ["bar", 1], ["baz", 2]]
csv.first.respond_to?(:unconverted_fields) # => false
# With unconverted_fields
csv = CSV.parse(str, converters: :integer, unconverted_fields: true)
csv # => [["foo", 0], ["bar", 1], ["baz", 2]]
csv.first.respond_to?(:unconverted_fields) # => true
csv.first.unconverted_fields # => ["foo", "0"]
csv-3.3.4/doc/csv/options/parsing/nil_value.rdoc 0000644 0000041 0000041 00000000537 15000146530 021670 0 ustar www-data www-data ====== Option +nil_value+
Specifies the object that is to be substituted for each null (no-text) field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:nil_value) # => nil
With the default, +nil+:
CSV.parse_line('a,,b,,c') # => ["a", nil, "b", nil, "c"]
With a different object:
CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
csv-3.3.4/doc/csv/options/parsing/strip.rdoc 0000644 0000041 0000041 00000000553 15000146530 021051 0 ustar www-data www-data ====== Option +strip+
Specifies the boolean value that determines whether
whitespace is stripped from each input field.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:strip) # => false
With default value +false+:
ary = CSV.parse_line(' a , b ')
ary # => [" a ", " b "]
With value +true+:
ary = CSV.parse_line(' a , b ', strip: true)
ary # => ["a", "b"]
csv-3.3.4/doc/csv/options/parsing/liberal_parsing.rdoc 0000644 0000041 0000041 00000002777 15000146530 023057 0 ustar www-data www-data ====== Option +liberal_parsing+
Specifies the boolean or hash value that determines whether
CSV will attempt to parse input not conformant with RFC 4180,
such as double quotes in unquoted fields.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:liberal_parsing) # => false
For the next two examples:
str = 'is,this "three, or four",fields'
Without +liberal_parsing+:
# Raises CSV::MalformedCSVError (Illegal quoting in str 1.)
CSV.parse_line(str)
With +liberal_parsing+:
ary = CSV.parse_line(str, liberal_parsing: true)
ary # => ["is", "this \"three", " or four\"", "fields"]
Use the +backslash_quote+ sub-option to parse values that use
a backslash to escape a double-quote character. This
causes the parser to treat \"
as if it were
""
.
For the next two examples:
str = 'Show,"Harry \"Handcuff\" Houdini, the one and only","Tampa Theater"'
With +liberal_parsing+, but without the +backslash_quote+ sub-option:
# Incorrect interpretation of backslash; incorrectly interprets the quoted comma as a field separator.
ary = CSV.parse_line(str, liberal_parsing: true)
ary # => ["Show", "\"Harry \\\"Handcuff\\\" Houdini", " the one and only\"", "Tampa Theater"]
puts ary[1] # => "Harry \"Handcuff\" Houdini
With +liberal_parsing+ and its +backslash_quote+ sub-option:
ary = CSV.parse_line(str, liberal_parsing: { backslash_quote: true })
ary # => ["Show", "Harry \"Handcuff\" Houdini, the one and only", "Tampa Theater"]
puts ary[1] # => Harry "Handcuff" Houdini, the one and only
csv-3.3.4/doc/csv/options/parsing/empty_value.rdoc 0000644 0000041 0000041 00000000620 15000146530 022235 0 ustar www-data www-data ====== Option +empty_value+
Specifies the object that is to be substituted
for each field that has an empty \String.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:empty_value) # => "" (empty string)
With the default, "":
CSV.parse_line('a,"",b,"",c') # => ["a", "", "b", "", "c"]
With a different object:
CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]
csv-3.3.4/doc/csv/options/parsing/skip_blanks.rdoc 0000644 0000041 0000041 00000001420 15000146530 022202 0 ustar www-data www-data ====== Option +skip_blanks+
Specifies a boolean that determines whether blank lines in the input will be ignored;
a line that contains a column separator is not considered to be blank.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:skip_blanks) # => false
See also option {skiplines}[#class-CSV-label-Option+skip_lines].
For examples in this section:
str = <<-EOT
foo,0
bar,1
baz,2
,
EOT
Using the default, +false+:
ary = CSV.parse(str)
ary # => [["foo", "0"], [], ["bar", "1"], ["baz", "2"], [], [nil, nil]]
Using +true+:
ary = CSV.parse(str, skip_blanks: true)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
Using a truthy value:
ary = CSV.parse(str, skip_blanks: :foo)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
csv-3.3.4/doc/csv/options/parsing/return_headers.rdoc 0000644 0000041 0000041 00000001070 15000146530 022715 0 ustar www-data www-data ====== Option +return_headers+
Specifies the boolean that determines whether method #shift
returns or ignores the header row.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:return_headers) # => false
Examples:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
# Without return_headers first row is str.
csv = CSV.new(str, headers: true)
csv.shift # => #
# With return_headers first row is headers.
csv = CSV.new(str, headers: true, return_headers: true)
csv.shift # => #
csv-3.3.4/doc/csv/options/parsing/skip_lines.rdoc 0000644 0000041 0000041 00000002074 15000146530 022050 0 ustar www-data www-data ====== Option +skip_lines+
Specifies an object to use in identifying comment lines in the input that are to be ignored:
* If a \Regexp, ignores lines that match it.
* If a \String, converts it to a \Regexp, ignores lines that match it.
* If +nil+, no lines are considered to be comments.
Default value:
CSV::DEFAULT_OPTIONS.fetch(:skip_lines) # => nil
For examples in this section:
str = <<-EOT
# Comment
foo,0
bar,1
baz,2
# Another comment
EOT
str # => "# Comment\nfoo,0\nbar,1\nbaz,2\n# Another comment\n"
Using the default, +nil+:
ary = CSV.parse(str)
ary # => [["# Comment"], ["foo", "0"], ["bar", "1"], ["baz", "2"], ["# Another comment"]]
Using a \Regexp:
ary = CSV.parse(str, skip_lines: /^#/)
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Using a \String:
ary = CSV.parse(str, skip_lines: '#')
ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
---
Raises an exception if given an object that is not a \Regexp, a \String, or +nil+:
# Raises ArgumentError (:skip_lines has to respond to #match: 0)
CSV.parse(str, skip_lines: 0)
csv-3.3.4/doc/csv/options/parsing/header_converters.rdoc 0000644 0000041 0000041 00000002701 15000146530 023407 0 ustar www-data www-data ====== Option +header_converters+
Specifies converters to be used in parsing headers.
See {Header Converters}[#class-CSV-label-Header+Converters]
Default value:
CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil
Identical in functionality to option {converters}[#class-CSV-label-Option+converters]
except that:
- The converters apply only to the header row.
- The built-in header converters are +:downcase+ and +:symbol+.
This section assumes prior execution of:
str = <<-EOT
Name,Value
foo,0
bar,1
baz,2
EOT
# With no header converter
table = CSV.parse(str, headers: true)
table.headers # => ["Name", "Value"]
The value may be a header converter name
(see {Stored Converters}[#class-CSV-label-Stored+Converters]):
table = CSV.parse(str, headers: true, header_converters: :downcase)
table.headers # => ["name", "value"]
The value may be a converter list
(see {Converter Lists}[#class-CSV-label-Converter+Lists]):
header_converters = [:downcase, :symbol]
table = CSV.parse(str, headers: true, header_converters: header_converters)
table.headers # => [:name, :value]
The value may be a \Proc custom converter
(see {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters]):
upcase_converter = proc {|field| field.upcase }
table = CSV.parse(str, headers: true, header_converters: upcase_converter)
table.headers # => ["NAME", "VALUE"]
See also {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters]
csv-3.3.4/doc/csv/recipes/ 0000755 0000041 0000041 00000000000 15000146530 015330 5 ustar www-data www-data csv-3.3.4/doc/csv/recipes/parsing.rdoc 0000644 0000041 0000041 00000053564 15000146530 017661 0 ustar www-data www-data == Recipes for Parsing \CSV
These recipes are specific code examples for specific \CSV parsing tasks.
For other recipes, see {Recipes for CSV}[./recipes_rdoc.html].
All code snippets on this page assume that the following has been executed:
require 'csv'
=== Contents
- {Source Formats}[#label-Source+Formats]
- {Parsing from a String}[#label-Parsing+from+a+String]
- {Recipe: Parse from String with Headers}[#label-Recipe-3A+Parse+from+String+with+Headers]
- {Recipe: Parse from String Without Headers}[#label-Recipe-3A+Parse+from+String+Without+Headers]
- {Parsing from a File}[#label-Parsing+from+a+File]
- {Recipe: Parse from File with Headers}[#label-Recipe-3A+Parse+from+File+with+Headers]
- {Recipe: Parse from File Without Headers}[#label-Recipe-3A+Parse+from+File+Without+Headers]
- {Parsing from an IO Stream}[#label-Parsing+from+an+IO+Stream]
- {Recipe: Parse from IO Stream with Headers}[#label-Recipe-3A+Parse+from+IO+Stream+with+Headers]
- {Recipe: Parse from IO Stream Without Headers}[#label-Recipe-3A+Parse+from+IO+Stream+Without+Headers]
- {RFC 4180 Compliance}[#label-RFC+4180+Compliance]
- {Row Separator}[#label-Row+Separator]
- {Recipe: Handle Compliant Row Separator}[#label-Recipe-3A+Handle+Compliant+Row+Separator]
- {Recipe: Handle Non-Compliant Row Separator}[#label-Recipe-3A+Handle+Non-Compliant+Row+Separator]
- {Column Separator}[#label-Column+Separator]
- {Recipe: Handle Compliant Column Separator}[#label-Recipe-3A+Handle+Compliant+Column+Separator]
- {Recipe: Handle Non-Compliant Column Separator}[#label-Recipe-3A+Handle+Non-Compliant+Column+Separator]
- {Quote Character}[#label-Quote+Character]
- {Recipe: Handle Compliant Quote Character}[#label-Recipe-3A+Handle+Compliant+Quote+Character]
- {Recipe: Handle Non-Compliant Quote Character}[#label-Recipe-3A+Handle+Non-Compliant+Quote+Character]
- {Recipe: Allow Liberal Parsing}[#label-Recipe-3A+Allow+Liberal+Parsing]
- {Special Handling}[#label-Special+Handling]
- {Special Line Handling}[#label-Special+Line+Handling]
- {Recipe: Ignore Blank Lines}[#label-Recipe-3A+Ignore+Blank+Lines]
- {Recipe: Ignore Selected Lines}[#label-Recipe-3A+Ignore+Selected+Lines]
- {Special Field Handling}[#label-Special+Field+Handling]
- {Recipe: Strip Fields}[#label-Recipe-3A+Strip+Fields]
- {Recipe: Handle Null Fields}[#label-Recipe-3A+Handle+Null+Fields]
- {Recipe: Handle Empty Fields}[#label-Recipe-3A+Handle+Empty+Fields]
- {Converting Fields}[#label-Converting+Fields]
- {Converting Fields to Objects}[#label-Converting+Fields+to+Objects]
- {Recipe: Convert Fields to Integers}[#label-Recipe-3A+Convert+Fields+to+Integers]
- {Recipe: Convert Fields to Floats}[#label-Recipe-3A+Convert+Fields+to+Floats]
- {Recipe: Convert Fields to Numerics}[#label-Recipe-3A+Convert+Fields+to+Numerics]
- {Recipe: Convert Fields to Dates}[#label-Recipe-3A+Convert+Fields+to+Dates]
- {Recipe: Convert Fields to DateTimes}[#label-Recipe-3A+Convert+Fields+to+DateTimes]
- {Recipe: Convert Fields to Times}[#label-Recipe-3A+Convert+Fields+to+Times]
- {Recipe: Convert Assorted Fields to Objects}[#label-Recipe-3A+Convert+Assorted+Fields+to+Objects]
- {Recipe: Convert Fields to Other Objects}[#label-Recipe-3A+Convert+Fields+to+Other+Objects]
- {Recipe: Filter Field Strings}[#label-Recipe-3A+Filter+Field+Strings]
- {Recipe: Register Field Converters}[#label-Recipe-3A+Register+Field+Converters]
- {Using Multiple Field Converters}[#label-Using+Multiple+Field+Converters]
- {Recipe: Specify Multiple Field Converters in Option :converters}[#label-Recipe-3A+Specify+Multiple+Field+Converters+in+Option+-3Aconverters]
- {Recipe: Specify Multiple Field Converters in a Custom Converter List}[#label-Recipe-3A+Specify+Multiple+Field+Converters+in+a+Custom+Converter+List]
- {Converting Headers}[#label-Converting+Headers]
- {Recipe: Convert Headers to Lowercase}[#label-Recipe-3A+Convert+Headers+to+Lowercase]
- {Recipe: Convert Headers to Symbols}[#label-Recipe-3A+Convert+Headers+to+Symbols]
- {Recipe: Filter Header Strings}[#label-Recipe-3A+Filter+Header+Strings]
- {Recipe: Register Header Converters}[#label-Recipe-3A+Register+Header+Converters]
- {Using Multiple Header Converters}[#label-Using+Multiple+Header+Converters]
- {Recipe: Specify Multiple Header Converters in Option :header_converters}[#label-Recipe-3A+Specify+Multiple+Header+Converters+in+Option+-3Aheader_converters]
- {Recipe: Specify Multiple Header Converters in a Custom Header Converter List}[#label-Recipe-3A+Specify+Multiple+Header+Converters+in+a+Custom+Header+Converter+List]
- {Diagnostics}[#label-Diagnostics]
- {Recipe: Capture Unconverted Fields}[#label-Recipe-3A+Capture+Unconverted+Fields]
- {Recipe: Capture Field Info}[#label-Recipe-3A+Capture+Field+Info]
=== Source Formats
You can parse \CSV data from a \String, from a \File (via its path), or from an \IO stream.
==== Parsing from a \String
You can parse \CSV data from a \String, with or without headers.
===== Recipe: Parse from \String with Headers
Use class method CSV.parse with option +headers+ to read a source \String all at once
(may have memory resource implications):
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
CSV.parse(string, headers: true) # => #
Use instance method CSV#each with option +headers+ to read a source \String one row at a time:
CSV.new(string, headers: true).each do |row|
p row
end
Output:
#
#
#
===== Recipe: Parse from \String Without Headers
Use class method CSV.parse without option +headers+ to read a source \String all at once
(may have memory resource implications):
string = "foo,0\nbar,1\nbaz,2\n"
CSV.parse(string) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Use instance method CSV#each without option +headers+ to read a source \String one row at a time:
CSV.new(string).each do |row|
p row
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
==== Parsing from a \File
You can parse \CSV data from a \File, with or without headers.
===== Recipe: Parse from \File with Headers
Use class method CSV.read with option +headers+ to read a file all at once:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.read(path, headers: true) # => #
Use class method CSV.foreach with option +headers+ to read one row at a time:
CSV.foreach(path, headers: true) do |row|
p row
end
Output:
#
#
#
===== Recipe: Parse from \File Without Headers
Use class method CSV.read without option +headers+ to read a file all at once:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Use class method CSV.foreach without option +headers+ to read one row at a time:
CSV.foreach(path) do |row|
p row
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
==== Parsing from an \IO Stream
You can parse \CSV data from an \IO stream, with or without headers.
===== Recipe: Parse from \IO Stream with Headers
Use class method CSV.parse with option +headers+ to read an \IO stream all at once:
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
File.open(path) do |file|
CSV.parse(file, headers: true)
end # => #
Use class method CSV.foreach with option +headers+ to read one row at a time:
File.open(path) do |file|
CSV.foreach(file, headers: true) do |row|
p row
end
end
Output:
#
#
#
===== Recipe: Parse from \IO Stream Without Headers
Use class method CSV.parse without option +headers+ to read an \IO stream all at once:
string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)
File.open(path) do |file|
CSV.parse(file)
end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Use class method CSV.foreach without option +headers+ to read one row at a time:
File.open(path) do |file|
CSV.foreach(file) do |row|
p row
end
end
Output:
["foo", "0"]
["bar", "1"]
["baz", "2"]
=== RFC 4180 Compliance
By default, \CSV parses data that is compliant with
{RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180]
with respect to:
- Row separator.
- Column separator.
- Quote character.
==== Row Separator
RFC 4180 specifies the row separator CRLF (Ruby "\r\n").
Although the \CSV default row separator is "\n",
the parser also by default handles row separator "\r" and the RFC-compliant "\r\n".
===== Recipe: Handle Compliant Row Separator
For strict compliance, use option +:row_sep+ to specify row separator "\r\n",
which allows the compliant row separator:
source = "foo,1\r\nbar,1\r\nbaz,2\r\n"
CSV.parse(source, row_sep: "\r\n") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
But rejects other row separators:
source = "foo,1\nbar,1\nbaz,2\n"
CSV.parse(source, row_sep: "\r\n") # Raised MalformedCSVError
source = "foo,1\rbar,1\rbaz,2\r"
CSV.parse(source, row_sep: "\r\n") # Raised MalformedCSVError
source = "foo,1\n\rbar,1\n\rbaz,2\n\r"
CSV.parse(source, row_sep: "\r\n") # Raised MalformedCSVError
===== Recipe: Handle Non-Compliant Row Separator
For data with non-compliant row separators, use option +:row_sep+.
This example source uses semicolon (";") as its row separator:
source = "foo,1;bar,1;baz,2;"
CSV.parse(source, row_sep: ';') # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
==== Column Separator
RFC 4180 specifies column separator COMMA (Ruby ",").
===== Recipe: Handle Compliant Column Separator
Because the \CSV default comma separator is ',',
you need not specify option +:col_sep+ for compliant data:
source = "foo,1\nbar,1\nbaz,2\n"
CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
===== Recipe: Handle Non-Compliant Column Separator
For data with non-compliant column separators, use option +:col_sep+.
This example source uses TAB ("\t") as its column separator:
source = "foo,1\tbar,1\tbaz,2"
CSV.parse(source, col_sep: "\t") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
==== Quote Character
RFC 4180 specifies quote character DQUOTE (Ruby "\"").
===== Recipe: Handle Compliant Quote Character
Because the \CSV default quote character is "\"",
you need not specify option +:quote_char+ for compliant data:
source = "\"foo\",\"1\"\n\"bar\",\"1\"\n\"baz\",\"2\"\n"
CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
===== Recipe: Handle Non-Compliant Quote Character
For data with non-compliant quote characters, use option +:quote_char+.
This example source uses SQUOTE ("'") as its quote character:
source = "'foo','1'\n'bar','1'\n'baz','2'\n"
CSV.parse(source, quote_char: "'") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
==== Recipe: Allow Liberal Parsing
Use option +:liberal_parsing+ to specify that \CSV should
attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields:
source = 'is,this "three, or four",fields'
CSV.parse(source) # Raises MalformedCSVError
CSV.parse(source, liberal_parsing: true) # => [["is", "this \"three", " or four\"", "fields"]]
=== Special Handling
You can use parsing options to specify special handling for certain lines and fields.
==== Special Line Handling
Use parsing options to specify special handling for blank lines, or for other selected lines.
===== Recipe: Ignore Blank Lines
Use option +:skip_blanks+ to ignore blank lines:
source = <<-EOT
foo,0
bar,1
baz,2
,
EOT
parsed = CSV.parse(source, skip_blanks: true)
parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
===== Recipe: Ignore Selected Lines
Use option +:skip_lines+ to ignore selected lines.
source = <<-EOT
# Comment
foo,0
bar,1
baz,2
# Another comment
EOT
parsed = CSV.parse(source, skip_lines: /^#/)
parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
==== Special Field Handling
Use parsing options to specify special handling for certain field values.
===== Recipe: Strip Fields
Use option +:strip+ to strip parsed field values:
CSV.parse_line(' a , b ', strip: true) # => ["a", "b"]
===== Recipe: Handle Null Fields
Use option +:nil_value+ to specify a value that will replace each field
that is null (no text):
CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
===== Recipe: Handle Empty Fields
Use option +:empty_value+ to specify a value that will replace each field
that is empty (\String of length 0);
CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]
=== Converting Fields
You can use field converters to change parsed \String fields into other objects,
or to otherwise modify the \String fields.
==== Converting Fields to Objects
Use field converters to change parsed \String objects into other, more specific, objects.
There are built-in field converters for converting to objects of certain classes:
- \Float
- \Integer
- \Date
- \DateTime
- \Time
Other built-in field converters include:
- +:numeric+: converts to \Integer and \Float.
- +:all+: converts to \DateTime, \Integer, \Float.
You can also define field converters to convert to objects of other classes.
===== Recipe: Convert Fields to Integers
Convert fields to \Integer objects using built-in converter +:integer+:
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: :integer)
parsed.map {|row| row['Value'].class} # => [Integer, Integer, Integer]
===== Recipe: Convert Fields to Floats
Convert fields to \Float objects using built-in converter +:float+:
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: :float)
parsed.map {|row| row['Value'].class} # => [Float, Float, Float]
===== Recipe: Convert Fields to Numerics
Convert fields to \Integer and \Float objects using built-in converter +:numeric+:
source = "Name,Value\nfoo,0\nbar,1.1\nbaz,2.2\n"
parsed = CSV.parse(source, headers: true, converters: :numeric)
parsed.map {|row| row['Value'].class} # => [Integer, Float, Float]
===== Recipe: Convert Fields to Dates
Convert fields to \Date objects using built-in converter +:date+:
source = "Name,Date\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2001-02-03\n"
parsed = CSV.parse(source, headers: true, converters: :date)
parsed.map {|row| row['Date'].class} # => [Date, Date, Date]
===== Recipe: Convert Fields to DateTimes
Convert fields to \DateTime objects using built-in converter +:date_time+:
source = "Name,DateTime\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
parsed = CSV.parse(source, headers: true, converters: :date_time)
parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime]
===== Recipe: Convert Fields to Times
Convert fields to \Time objects using built-in converter +:time+:
source = "Name,Time\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
parsed = CSV.parse(source, headers: true, converters: :time)
parsed.map {|row| row['Time'].class} # => [Time, Time, Time]
===== Recipe: Convert Assorted Fields to Objects
Convert assorted fields to objects using built-in converter +:all+:
source = "Type,Value\nInteger,0\nFloat,1.0\nDateTime,2001-02-04\n"
parsed = CSV.parse(source, headers: true, converters: :all)
parsed.map {|row| row['Value'].class} # => [Integer, Float, DateTime]
===== Recipe: Convert Fields to Other Objects
Define a custom field converter to convert \String fields into other objects.
This example defines and uses a custom field converter
that converts each column-1 value to a \Rational object:
rational_converter = proc do |field, field_context|
field_context.index == 1 ? field.to_r : field
end
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: rational_converter)
parsed.map {|row| row['Value'].class} # => [Rational, Rational, Rational]
==== Recipe: Filter Field Strings
Define a custom field converter to modify \String fields.
This example defines and uses a custom field converter
that strips whitespace from each field value:
strip_converter = proc {|field| field.strip }
source = "Name,Value\n foo , 0 \n bar , 1 \n baz , 2 \n"
parsed = CSV.parse(source, headers: true, converters: strip_converter)
parsed['Name'] # => ["foo", "bar", "baz"]
parsed['Value'] # => ["0", "1", "2"]
==== Recipe: Register Field Converters
Register a custom field converter, assigning it a name;
then refer to the converter by its name:
rational_converter = proc do |field, field_context|
field_context.index == 1 ? field.to_r : field
end
CSV::Converters[:rational] = rational_converter
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, converters: :rational)
parsed['Value'] # => [(0/1), (1/1), (2/1)]
==== Using Multiple Field Converters
You can use multiple field converters in either of these ways:
- Specify converters in option +:converters+.
- Specify converters in a custom converter list.
===== Recipe: Specify Multiple Field Converters in Option +:converters+
Apply multiple field converters by specifying them in option +:converters+:
source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
parsed = CSV.parse(source, headers: true, converters: [:integer, :float])
parsed['Value'] # => [0, 1.0, 2.0]
===== Recipe: Specify Multiple Field Converters in a Custom Converter List
Apply multiple field converters by defining and registering a custom converter list:
strip_converter = proc {|field| field.strip }
CSV::Converters[:strip] = strip_converter
CSV::Converters[:my_converters] = [:integer, :float, :strip]
source = "Name,Value\n foo , 0 \n bar , 1.0 \n baz , 2.0 \n"
parsed = CSV.parse(source, headers: true, converters: :my_converters)
parsed['Name'] # => ["foo", "bar", "baz"]
parsed['Value'] # => [0, 1.0, 2.0]
=== Converting Headers
You can use header converters to modify parsed \String headers.
Built-in header converters include:
- +:symbol+: converts \String header to \Symbol.
- +:downcase+: converts \String header to lowercase.
You can also define header converters to otherwise modify header \Strings.
==== Recipe: Convert Headers to Lowercase
Convert headers to lowercase using built-in converter +:downcase+:
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: :downcase)
parsed.headers # => ["name", "value"]
==== Recipe: Convert Headers to Symbols
Convert headers to downcased Symbols using built-in converter +:symbol+:
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: :symbol)
parsed.headers # => [:name, :value]
parsed.headers.map {|header| header.class} # => [Symbol, Symbol]
==== Recipe: Filter Header Strings
Define a custom header converter to modify \String fields.
This example defines and uses a custom header converter
that capitalizes each header \String:
capitalize_converter = proc {|header| header.capitalize }
source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: capitalize_converter)
parsed.headers # => ["Name", "Value"]
==== Recipe: Register Header Converters
Register a custom header converter, assigning it a name;
then refer to the converter by its name:
capitalize_converter = proc {|header| header.capitalize }
CSV::HeaderConverters[:capitalize] = capitalize_converter
source = "NAME,VALUE\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, headers: true, header_converters: :capitalize)
parsed.headers # => ["Name", "Value"]
==== Using Multiple Header Converters
You can use multiple header converters in either of these ways:
- Specify header converters in option +:header_converters+.
- Specify header converters in a custom header converter list.
===== Recipe: Specify Multiple Header Converters in Option :header_converters
Apply multiple header converters by specifying them in option +:header_converters+:
source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
parsed = CSV.parse(source, headers: true, header_converters: [:downcase, :symbol])
parsed.headers # => [:name, :value]
===== Recipe: Specify Multiple Header Converters in a Custom Header Converter List
Apply multiple header converters by defining and registering a custom header converter list:
CSV::HeaderConverters[:my_header_converters] = [:symbol, :downcase]
source = "NAME,VALUE\nfoo,0\nbar,1.0\nbaz,2.0\n"
parsed = CSV.parse(source, headers: true, header_converters: :my_header_converters)
parsed.headers # => [:name, :value]
=== Diagnostics
==== Recipe: Capture Unconverted Fields
To capture unconverted field values, use option +:unconverted_fields+:
source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
parsed = CSV.parse(source, converters: :integer, unconverted_fields: true)
parsed # => [["Name", "Value"], ["foo", 0], ["bar", 1], ["baz", 2]]
parsed.each {|row| p row.unconverted_fields }
Output:
["Name", "Value"]
["foo", "0"]
["bar", "1"]
["baz", "2"]
==== Recipe: Capture Field Info
To capture field info in a custom converter, accept two block arguments.
The first is the field value; the second is a +CSV::FieldInfo+ object:
strip_converter = proc {|field, field_info| p field_info; field.strip }
source = " foo , 0 \n bar , 1 \n baz , 2 \n"
parsed = CSV.parse(source, converters: strip_converter)
parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
Output:
#
#
#
#
#
#
csv-3.3.4/doc/csv/recipes/generating.rdoc 0000644 0000041 0000041 00000024463 15000146530 020335 0 ustar www-data www-data == Recipes for Generating \CSV
These recipes are specific code examples for specific \CSV generating tasks.
For other recipes, see {Recipes for CSV}[./recipes_rdoc.html].
All code snippets on this page assume that the following has been executed:
require 'csv'
=== Contents
- {Output Formats}[#label-Output+Formats]
- {Generating to a String}[#label-Generating+to+a+String]
- {Recipe: Generate to String with Headers}[#label-Recipe-3A+Generate+to+String+with+Headers]
- {Recipe: Generate to String Without Headers}[#label-Recipe-3A+Generate+to+String+Without+Headers]
- {Generating to a File}[#label-Generating+to+a+File]
- {Recipe: Generate to File with Headers}[#label-Recipe-3A+Generate+to+File+with+Headers]
- {Recipe: Generate to File Without Headers}[#label-Recipe-3A+Generate+to+File+Without+Headers]
- {Generating to IO an Stream}[#label-Generating+to+an+IO+Stream]
- {Recipe: Generate to IO Stream with Headers}[#label-Recipe-3A+Generate+to+IO+Stream+with+Headers]
- {Recipe: Generate to IO Stream Without Headers}[#label-Recipe-3A+Generate+to+IO+Stream+Without+Headers]
- {Converting Fields}[#label-Converting+Fields]
- {Recipe: Filter Generated Field Strings}[#label-Recipe-3A+Filter+Generated+Field+Strings]
- {Recipe: Specify Multiple Write Converters}[#label-Recipe-3A+Specify+Multiple+Write+Converters]
- {RFC 4180 Compliance}[#label-RFC+4180+Compliance]
- {Row Separator}[#label-Row+Separator]
- {Recipe: Generate Compliant Row Separator}[#label-Recipe-3A+Generate+Compliant+Row+Separator]
- {Recipe: Generate Non-Compliant Row Separator}[#label-Recipe-3A+Generate+Non-Compliant+Row+Separator]
- {Column Separator}[#label-Column+Separator]
- {Recipe: Generate Compliant Column Separator}[#label-Recipe-3A+Generate+Compliant+Column+Separator]
- {Recipe: Generate Non-Compliant Column Separator}[#label-Recipe-3A+Generate+Non-Compliant+Column+Separator]
- {Quotes}[#label-Quotes]
- {Recipe: Quote All Fields}[#label-Recipe-3A+Quote+All+Fields]
- {Recipe: Quote Empty Fields}[#label-Recipe-3A+Quote+Empty+Fields]
- {Recipe: Generate Compliant Quote Character}[#label-Recipe-3A+Generate+Compliant+Quote+Character]
- {Recipe: Generate Non-Compliant Quote Character}[#label-Recipe-3A+Generate+Non-Compliant+Quote+Character]
=== Output Formats
You can generate \CSV output to a \String, to a \File (via its path), or to an \IO stream.
==== Generating to a \String
You can generate \CSV output to a \String, with or without headers.
===== Recipe: Generate to \String with Headers
Use class method CSV.generate with option +headers+ to generate to a \String.
This example uses method CSV#<< to append the rows
that are to be generated:
output_string = CSV.generate('', headers: ['Name', 'Value'], write_headers: true) do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "Name,Value\nFoo,0\nBar,1\nBaz,2\n"
===== Recipe: Generate to \String Without Headers
Use class method CSV.generate without option +headers+ to generate to a \String.
This example uses method CSV#<< to append the rows
that are to be generated:
output_string = CSV.generate do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "Foo,0\nBar,1\nBaz,2\n"
==== Generating to a \File
You can generate /CSV data to a \File, with or without headers.
===== Recipe: Generate to \File with Headers
Use class method CSV.open with option +headers+ generate to a \File.
This example uses method CSV#<< to append the rows
that are to be generated:
path = 't.csv'
CSV.open(path, 'w', headers: ['Name', 'Value'], write_headers: true) do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
p File.read(path) # => "Name,Value\nFoo,0\nBar,1\nBaz,2\n"
===== Recipe: Generate to \File Without Headers
Use class method CSV.open without option +headers+ to generate to a \File.
This example uses method CSV#<< to append the rows
that are to be generated:
path = 't.csv'
CSV.open(path, 'w') do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
p File.read(path) # => "Foo,0\nBar,1\nBaz,2\n"
==== Generating to an \IO Stream
You can generate \CSV data to an \IO stream, with or without headers.
==== Recipe: Generate to \IO Stream with Headers
Use class method CSV.new with option +headers+ to generate \CSV data to an \IO stream:
path = 't.csv'
File.open(path, 'w') do |file|
csv = CSV.new(file, headers: ['Name', 'Value'], write_headers: true)
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
p File.read(path) # => "Name,Value\nFoo,0\nBar,1\nBaz,2\n"
===== Recipe: Generate to \IO Stream Without Headers
Use class method CSV.new without option +headers+ to generate \CSV data to an \IO stream:
path = 't.csv'
File.open(path, 'w') do |file|
csv = CSV.new(file)
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
p File.read(path) # => "Foo,0\nBar,1\nBaz,2\n"
=== Converting Fields
You can use _write_ _converters_ to convert fields when generating \CSV.
==== Recipe: Filter Generated Field Strings
Use option :write_converters and a custom converter to convert field values when generating \CSV.
This example defines and uses a custom write converter to strip whitespace from generated fields:
strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
output_string = CSV.generate(write_converters: strip_converter) do |csv|
csv << [' foo ', 0]
csv << [' bar ', 1]
csv << [' baz ', 2]
end
output_string # => "foo,0\nbar,1\nbaz,2\n"
==== Recipe: Specify Multiple Write Converters
Use option :write_converters and multiple custom converters
to convert field values when generating \CSV.
This example defines and uses two custom write converters to strip and upcase generated fields:
strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field }
converters = [strip_converter, upcase_converter]
output_string = CSV.generate(write_converters: converters) do |csv|
csv << [' foo ', 0]
csv << [' bar ', 1]
csv << [' baz ', 2]
end
output_string # => "FOO,0\nBAR,1\nBAZ,2\n"
=== RFC 4180 Compliance
By default, \CSV generates data that is compliant with
{RFC 4180}[https://www.rfc-editor.org/rfc/rfc4180]
with respect to:
- Column separator.
- Quote character.
==== Row Separator
RFC 4180 specifies the row separator CRLF (Ruby "\r\n").
===== Recipe: Generate Compliant Row Separator
For strict compliance, use option +:row_sep+ to specify row separator "\r\n":
output_string = CSV.generate('', row_sep: "\r\n") do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "Foo,0\r\nBar,1\r\nBaz,2\r\n"
===== Recipe: Generate Non-Compliant Row Separator
For data with non-compliant row separators, use option +:row_sep+ with a different value:
This example source uses semicolon (";') as its row separator:
output_string = CSV.generate('', row_sep: ";") do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "Foo,0;Bar,1;Baz,2;"
==== Column Separator
RFC 4180 specifies column separator COMMA (Ruby ",").
===== Recipe: Generate Compliant Column Separator
Because the \CSV default comma separator is ",",
you need not specify option +:col_sep+ for compliant data:
output_string = CSV.generate('') do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "Foo,0\nBar,1\nBaz,2\n"
===== Recipe: Generate Non-Compliant Column Separator
For data with non-compliant column separators, use option +:col_sep+.
This example source uses TAB ("\t") as its column separator:
output_string = CSV.generate('', col_sep: "\t") do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "Foo\t0\nBar\t1\nBaz\t2\n"
==== Quotes
IFC 4180 allows most fields to be quoted or not.
By default, \CSV does not quote most fields.
However, a field containing the current row separator, column separator,
or quote character is automatically quoted, producing IFC 4180 compliance:
# Field contains row separator.
output_string = CSV.generate('') do |csv|
row_sep = csv.row_sep
csv << ["Foo#{row_sep}Foo", 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "\"Foo\nFoo\",0\nBar,1\nBaz,2\n"
# Field contains column separator.
output_string = CSV.generate('') do |csv|
col_sep = csv.col_sep
csv << ["Foo#{col_sep}Foo", 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "\"Foo,Foo\",0\nBar,1\nBaz,2\n"
# Field contains quote character.
output_string = CSV.generate('') do |csv|
quote_char = csv.quote_char
csv << ["Foo#{quote_char}Foo", 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "\"Foo\"\"Foo\",0\nBar,1\nBaz,2\n"
===== Recipe: Quote All Fields
Use option +:force_quotes+ to force quoted fields:
output_string = CSV.generate('', force_quotes: true) do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "\"Foo\",\"0\"\n\"Bar\",\"1\"\n\"Baz\",\"2\"\n"
===== Recipe: Quote Empty Fields
Use option +:quote_empty+ to force quoting for empty fields:
output_string = CSV.generate('', quote_empty: true) do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['', 2]
end
output_string # => "Foo,0\nBar,1\n\"\",2\n"
===== Recipe: Generate Compliant Quote Character
RFC 4180 specifies quote character DQUOTE (Ruby "\"").
Because the \CSV default quote character is also "\"",
you need not specify option +:quote_char+ for compliant data:
output_string = CSV.generate('', force_quotes: true) do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "\"Foo\",\"0\"\n\"Bar\",\"1\"\n\"Baz\",\"2\"\n"
===== Recipe: Generate Non-Compliant Quote Character
For data with non-compliant quote characters, use option +:quote_char+.
This example source uses SQUOTE ("'") as its quote character:
output_string = CSV.generate('', quote_char: "'", force_quotes: true) do |csv|
csv << ['Foo', 0]
csv << ['Bar', 1]
csv << ['Baz', 2]
end
output_string # => "'Foo','0'\n'Bar','1'\n'Baz','2'\n"
csv-3.3.4/doc/csv/recipes/recipes.rdoc 0000644 0000041 0000041 00000000362 15000146530 017634 0 ustar www-data www-data == Recipes for \CSV
The recipes are specific code examples for specific tasks. See:
- {Recipes for Parsing CSV}[./parsing_rdoc.html]
- {Recipes for Generating CSV}[./generating_rdoc.html]
- {Recipes for Filtering CSV}[./filtering_rdoc.html]
csv-3.3.4/doc/csv/recipes/filtering.rdoc 0000644 0000041 0000041 00000021300 15000146530 020160 0 ustar www-data www-data == Recipes for Filtering \CSV
These recipes are specific code examples for specific \CSV filtering tasks.
For other recipes, see {Recipes for CSV}[./recipes_rdoc.html].
All code snippets on this page assume that the following has been executed:
require 'csv'
=== Contents
- {Source and Output Formats}[#label-Source+and+Output+Formats]
- {Filtering String to String}[#label-Filtering+String+to+String]
- {Recipe: Filter String to String parsing Headers}[#label-Recipe-3A+Filter+String+to+String+parsing+Headers]
- {Recipe: Filter String to String parsing and writing Headers}[#label-Recipe-3A+Filter+String+to+String+parsing+and+writing+Headers]
- {Recipe: Filter String to String Without Headers}[#label-Recipe-3A+Filter+String+to+String+Without+Headers]
- {Filtering String to IO Stream}[#label-Filtering+String+to+IO+Stream]
- {Recipe: Filter String to IO Stream parsing Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+parsing+Headers]
- {Recipe: Filter String to IO Stream parsing and writing Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+parsing+and+writing+Headers]
- {Recipe: Filter String to IO Stream Without Headers}[#label-Recipe-3A+Filter+String+to+IO+Stream+Without+Headers]
- {Filtering IO Stream to String}[#label-Filtering+IO+Stream+to+String]
- {Recipe: Filter IO Stream to String parsing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+parsing+Headers]
- {Recipe: Filter IO Stream to String parsing and writing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+parsing+and+writing+Headers]
- {Recipe: Filter IO Stream to String Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+String+Without+Headers]
- {Filtering IO Stream to IO Stream}[#label-Filtering+IO+Stream+to+IO+Stream]
- {Recipe: Filter IO Stream to IO Stream parsing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+parsing+Headers]
- {Recipe: Filter IO Stream to IO Stream parsing and writing Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+parsing+and+writing+Headers]
- {Recipe: Filter IO Stream to IO Stream Without Headers}[#label-Recipe-3A+Filter+IO+Stream+to+IO+Stream+Without+Headers]
=== Source and Output Formats
You can use a Unix-style "filter" for \CSV data.
The filter reads source \CSV data and writes output \CSV data as modified by the filter.
The input and output \CSV data may be any mixture of \Strings and \IO streams.
==== Filtering \String to \String
You can filter one \String to another, with or without headers.
===== Recipe: Filter \String to \String parsing Headers
Use class method CSV.filter with option +headers+ to filter a \String to another \String:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
out_string = ''
CSV.filter(in_string, out_string, headers: true) do |row|
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \String to \String parsing and writing Headers
Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter a \String to another \String including header row:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
out_string = ''
CSV.filter(in_string, out_string, headers: true, out_write_headers: true) do |row|
unless row.is_a?(Array)
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \String to \String Without Headers
Use class method CSV.filter without option +headers+ to filter a \String to another \String:
in_string = "foo,0\nbar,1\nbaz,2\n"
out_string = ''
CSV.filter(in_string, out_string) do |row|
row[0] = row[0].upcase
row[1] *= 4
end
out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
==== Filtering \String to \IO Stream
You can filter a \String to an \IO stream, with or without headers.
===== Recipe: Filter \String to \IO Stream parsing Headers
Use class method CSV.filter with option +headers+ to filter a \String to an \IO stream:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.open(path, 'w') do |out_io|
CSV.filter(in_string, out_io, headers: true) do |row|
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
p File.read(path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \String to \IO Stream parsing and writing Headers
Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter a \String to an \IO stream including header row:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.open(path, 'w') do |out_io|
CSV.filter(in_string, out_io, headers: true, out_write_headers: true ) do |row|
unless row.is_a?(Array)
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
end
p File.read(path) # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \String to \IO Stream Without Headers
Use class method CSV.filter without option +headers+ to filter a \String to an \IO stream:
in_string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.open(path, 'w') do |out_io|
CSV.filter(in_string, out_io) do |row|
row[0] = row[0].upcase
row[1] *= 4
end
end
p File.read(path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
==== Filtering \IO Stream to \String
You can filter an \IO stream to a \String, with or without headers.
===== Recipe: Filter \IO Stream to \String parsing Headers
Use class method CSV.filter with option +headers+ to filter an \IO stream to a \String:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, in_string)
out_string = ''
File.open(path) do |in_io|
CSV.filter(in_io, out_string, headers: true) do |row|
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \IO Stream to \String parsing and writing Headers
Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter an \IO stream to a \String including header row:
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, in_string)
out_string = ''
File.open(path) do |in_io|
CSV.filter(in_io, out_string, headers: true, out_write_headers: true) do |row|
unless row.is_a?(Array)
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
end
out_string # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \IO Stream to \String Without Headers
Use class method CSV.filter without option +headers+ to filter an \IO stream to a \String:
in_string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, in_string)
out_string = ''
File.open(path) do |in_io|
CSV.filter(in_io, out_string) do |row|
row[0] = row[0].upcase
row[1] *= 4
end
end
out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
==== Filtering \IO Stream to \IO Stream
You can filter an \IO stream to another \IO stream, with or without headers.
===== Recipe: Filter \IO Stream to \IO Stream parsing Headers
Use class method CSV.filter with option +headers+ to filter an \IO stream to another \IO stream:
in_path = 't.csv'
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
File.write(in_path, in_string)
out_path = 'u.csv'
File.open(in_path) do |in_io|
File.open(out_path, 'w') do |out_io|
CSV.filter(in_io, out_io, headers: true) do |row|
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
end
p File.read(out_path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \IO Stream to \IO Stream parsing and writing Headers
Use class method CSV.filter with option +headers+ and +out_write_headers+ to filter an \IO stream to another \IO stream including header row:
in_path = 't.csv'
in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
File.write(in_path, in_string)
out_path = 'u.csv'
File.open(in_path) do |in_io|
File.open(out_path, 'w') do |out_io|
CSV.filter(in_io, out_io, headers: true, out_write_headers: true) do |row|
unless row.is_a?(Array)
row['Name'] = row['Name'].upcase
row['Value'] *= 4
end
end
end
end
p File.read(out_path) # => "Name,Value\nFOO,0000\nBAR,1111\nBAZ,2222\n"
===== Recipe: Filter \IO Stream to \IO Stream Without Headers
Use class method CSV.filter without option +headers+ to filter an \IO stream to another \IO stream:
in_path = 't.csv'
in_string = "foo,0\nbar,1\nbaz,2\n"
File.write(in_path, in_string)
out_path = 'u.csv'
File.open(in_path) do |in_io|
File.open(out_path, 'w') do |out_io|
CSV.filter(in_io, out_io) do |row|
row[0] = row[0].upcase
row[1] *= 4
end
end
end
p File.read(out_path) # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
csv-3.3.4/doc/csv/arguments/ 0000755 0000041 0000041 00000000000 15000146530 015703 5 ustar www-data www-data csv-3.3.4/doc/csv/arguments/io.rdoc 0000644 0000041 0000041 00000000433 15000146530 017163 0 ustar www-data www-data * Argument +io+ should be an IO object that is:
* Open for reading; on return, the IO object will be closed.
* Positioned at the beginning.
To position at the end, for appending, use method CSV.generate.
For any other positioning, pass a preset \StringIO object instead.
csv-3.3.4/lib/ 0000755 0000041 0000041 00000000000 15000146530 013104 5 ustar www-data www-data csv-3.3.4/lib/csv.rb 0000644 0000041 0000041 00000303515 15000146530 014233 0 ustar www-data www-data # encoding: US-ASCII
# frozen_string_literal: true
# = csv.rb -- CSV Reading and Writing
#
# Created by James Edward Gray II on 2005-10-31.
#
# See CSV for documentation.
#
# == Description
#
# Welcome to the new and improved CSV.
#
# This version of the CSV library began its life as FasterCSV. FasterCSV was
# intended as a replacement to Ruby's then standard CSV library. It was
# designed to address concerns users of that library had and it had three
# primary goals:
#
# 1. Be significantly faster than CSV while remaining a pure Ruby library.
# 2. Use a smaller and easier to maintain code base. (FasterCSV eventually
# grew larger, was also but considerably richer in features. The parsing
# core remains quite small.)
# 3. Improve on the CSV interface.
#
# Obviously, the last one is subjective. I did try to defer to the original
# interface whenever I didn't have a compelling reason to change it though, so
# hopefully this won't be too radically different.
#
# We must have met our goals because FasterCSV was renamed to CSV and replaced
# the original library as of Ruby 1.9. If you are migrating code from 1.8 or
# earlier, you may have to change your code to comply with the new interface.
#
# == What's the Different From the Old CSV?
#
# I'm sure I'll miss something, but I'll try to mention most of the major
# differences I am aware of, to help others quickly get up to speed:
#
# === \CSV Parsing
#
# * This parser is m17n aware. See CSV for full details.
# * This library has a stricter parser and will throw MalformedCSVErrors on
# problematic data.
# * This library has a less liberal idea of a line ending than CSV. What you
# set as the :row_sep is law. It can auto-detect your line endings
# though.
# * The old library returned empty lines as [nil]. This library calls
# them [].
# * This library has a much faster parser.
#
# === Interface
#
# * CSV now uses keyword parameters to set options.
# * CSV no longer has generate_row() or parse_row().
# * The old CSV's Reader and Writer classes have been dropped.
# * CSV::open() is now more like Ruby's open().
# * CSV objects now support most standard IO methods.
# * CSV now has a new() method used to wrap objects like String and IO for
# reading and writing.
# * CSV::generate() is different from the old method.
# * CSV no longer supports partial reads. It works line-by-line.
# * CSV no longer allows the instance methods to override the separators for
# performance reasons. They must be set in the constructor.
#
# If you use this library and find yourself missing any functionality I have
# trimmed, please {let me know}[mailto:james@grayproductions.net].
#
# == Documentation
#
# See CSV for documentation.
#
# == What is CSV, really?
#
# CSV maintains a pretty strict definition of CSV taken directly from
# {the RFC}[https://www.ietf.org/rfc/rfc4180.txt]. I relax the rules in only one
# place and that is to make using this library easier. CSV will parse all valid
# CSV.
#
# What you don't want to do is to feed CSV invalid data. Because of the way the
# CSV format works, it's common for a parser to need to read until the end of
# the file to be sure a field is invalid. This consumes a lot of time and memory.
#
# Luckily, when working with invalid CSV, Ruby's built-in methods will almost
# always be superior in every way. For example, parsing non-quoted fields is as
# easy as:
#
# data.split(",")
#
# == Questions and/or Comments
#
# Feel free to email {James Edward Gray II}[mailto:james@grayproductions.net]
# with any questions.
require "forwardable"
require "date"
require "time"
require "stringio"
require_relative "csv/fields_converter"
require_relative "csv/input_record_separator"
require_relative "csv/parser"
require_relative "csv/row"
require_relative "csv/table"
require_relative "csv/writer"
# == \CSV
#
# === \CSV Data
#
# \CSV (comma-separated values) data is a text representation of a table:
# - A _row_ _separator_ delimits table rows.
# A common row separator is the newline character "\n".
# - A _column_ _separator_ delimits fields in a row.
# A common column separator is the comma character ",".
#
# This \CSV \String, with row separator "\n"
# and column separator ",",
# has three rows and two columns:
# "foo,0\nbar,1\nbaz,2\n"
#
# Despite the name \CSV, a \CSV representation can use different separators.
#
# For more about tables, see the Wikipedia article
# "{Table (information)}[https://en.wikipedia.org/wiki/Table_(information)]",
# especially its section
# "{Simple table}[https://en.wikipedia.org/wiki/Table_(information)#Simple_table]"
#
# == \Class \CSV
#
# Class \CSV provides methods for:
# - Parsing \CSV data from a \String object, a \File (via its file path), or an \IO object.
# - Generating \CSV data to a \String object.
#
# To make \CSV available:
# require 'csv'
#
# All examples here assume that this has been done.
#
# == Keeping It Simple
#
# A \CSV object has dozens of instance methods that offer fine-grained control
# of parsing and generating \CSV data.
# For many needs, though, simpler approaches will do.
#
# This section summarizes the singleton methods in \CSV
# that allow you to parse and generate without explicitly
# creating \CSV objects.
# For details, follow the links.
#
# === Simple Parsing
#
# Parsing methods commonly return either of:
# - An \Array of Arrays of Strings:
# - The outer \Array is the entire "table".
# - Each inner \Array is a row.
# - Each \String is a field.
# - A CSV::Table object. For details, see
# {\CSV with Headers}[#class-CSV-label-CSV+with+Headers].
#
# ==== Parsing a \String
#
# The input to be parsed can be a string:
# string = "foo,0\nbar,1\nbaz,2\n"
#
# \Method CSV.parse returns the entire \CSV data:
# CSV.parse(string) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# \Method CSV.parse_line returns only the first row:
# CSV.parse_line(string) # => ["foo", "0"]
#
# \CSV extends class \String with instance method String#parse_csv,
# which also returns only the first row:
# string.parse_csv # => ["foo", "0"]
#
# ==== Parsing Via a \File Path
#
# The input to be parsed can be in a file:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# \Method CSV.read returns the entire \CSV data:
# CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# \Method CSV.foreach iterates, passing each row to the given block:
# CSV.foreach(path) do |row|
# p row
# end
# Output:
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# \Method CSV.table returns the entire \CSV data as a CSV::Table object:
# CSV.table(path) # => #
#
# ==== Parsing from an Open \IO Stream
#
# The input to be parsed can be in an open \IO stream:
#
# \Method CSV.read returns the entire \CSV data:
# File.open(path) do |file|
# CSV.read(file)
# end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# As does method CSV.parse:
# File.open(path) do |file|
# CSV.parse(file)
# end # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# \Method CSV.parse_line returns only the first row:
# File.open(path) do |file|
# CSV.parse_line(file)
# end # => ["foo", "0"]
#
# \Method CSV.foreach iterates, passing each row to the given block:
# File.open(path) do |file|
# CSV.foreach(file) do |row|
# p row
# end
# end
# Output:
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# \Method CSV.table returns the entire \CSV data as a CSV::Table object:
# File.open(path) do |file|
# CSV.table(file)
# end # => #
#
# === Simple Generating
#
# \Method CSV.generate returns a \String;
# this example uses method CSV#<< to append the rows
# that are to be generated:
# output_string = CSV.generate do |csv|
# csv << ['foo', 0]
# csv << ['bar', 1]
# csv << ['baz', 2]
# end
# output_string # => "foo,0\nbar,1\nbaz,2\n"
#
# \Method CSV.generate_line returns a \String containing the single row
# constructed from an \Array:
# CSV.generate_line(['foo', '0']) # => "foo,0\n"
#
# \CSV extends class \Array with instance method Array#to_csv,
# which forms an \Array into a \String:
# ['foo', '0'].to_csv # => "foo,0\n"
#
# === "Filtering" \CSV
#
# \Method CSV.filter provides a Unix-style filter for \CSV data.
# The input data is processed to form the output data:
# in_string = "foo,0\nbar,1\nbaz,2\n"
# out_string = ''
# CSV.filter(in_string, out_string) do |row|
# row[0] = row[0].upcase
# row[1] *= 4
# end
# out_string # => "FOO,0000\nBAR,1111\nBAZ,2222\n"
#
# == \CSV Objects
#
# There are three ways to create a \CSV object:
# - \Method CSV.new returns a new \CSV object.
# - \Method CSV.instance returns a new or cached \CSV object.
# - \Method \CSV() also returns a new or cached \CSV object.
#
# === Instance Methods
#
# \CSV has three groups of instance methods:
# - Its own internally defined instance methods.
# - Methods included by module Enumerable.
# - Methods delegated to class IO. See below.
#
# ==== Delegated Methods
#
# For convenience, a CSV object will delegate to many methods in class IO.
# (A few have wrapper "guard code" in \CSV.) You may call:
# * IO#binmode
# * #binmode?
# * IO#close
# * IO#close_read
# * IO#close_write
# * IO#closed?
# * #eof
# * #eof?
# * IO#external_encoding
# * IO#fcntl
# * IO#fileno
# * #flock
# * IO#flush
# * IO#fsync
# * IO#internal_encoding
# * #ioctl
# * IO#isatty
# * #path
# * IO#pid
# * IO#pos
# * IO#pos=
# * IO#reopen
# * #rewind
# * IO#seek
# * #stat
# * IO#string
# * IO#sync
# * IO#sync=
# * IO#tell
# * #to_i
# * #to_io
# * IO#truncate
# * IO#tty?
#
# === Options
#
# The default values for options are:
# DEFAULT_OPTIONS = {
# # For both parsing and generating.
# col_sep: ",",
# row_sep: :auto,
# quote_char: '"',
# # For parsing.
# field_size_limit: nil,
# converters: nil,
# unconverted_fields: nil,
# headers: false,
# return_headers: false,
# header_converters: nil,
# skip_blanks: false,
# skip_lines: nil,
# liberal_parsing: false,
# nil_value: nil,
# empty_value: "",
# strip: false,
# # For generating.
# write_headers: nil,
# quote_empty: true,
# force_quotes: false,
# write_converters: nil,
# write_nil_value: nil,
# write_empty_value: "",
# }
#
# ==== Options for Parsing
#
# Options for parsing, described in detail below, include:
# - +row_sep+: Specifies the row separator; used to delimit rows.
# - +col_sep+: Specifies the column separator; used to delimit fields.
# - +quote_char+: Specifies the quote character; used to quote fields.
# - +field_size_limit+: Specifies the maximum field size + 1 allowed.
# Deprecated since 3.2.3. Use +max_field_size+ instead.
# - +max_field_size+: Specifies the maximum field size allowed.
# - +converters+: Specifies the field converters to be used.
# - +unconverted_fields+: Specifies whether unconverted fields are to be available.
# - +headers+: Specifies whether data contains headers,
# or specifies the headers themselves.
# - +return_headers+: Specifies whether headers are to be returned.
# - +header_converters+: Specifies the header converters to be used.
# - +skip_blanks+: Specifies whether blanks lines are to be ignored.
# - +skip_lines+: Specifies how comments lines are to be recognized.
# - +strip+: Specifies whether leading and trailing whitespace are to be
# stripped from fields. This must be compatible with +col_sep+; if it is not,
# then an +ArgumentError+ exception will be raised.
# - +liberal_parsing+: Specifies whether \CSV should attempt to parse
# non-compliant data.
# - +nil_value+: Specifies the object that is to be substituted for each null (no-text) field.
# - +empty_value+: Specifies the object that is to be substituted for each empty field.
#
# :include: ../doc/csv/options/common/row_sep.rdoc
#
# :include: ../doc/csv/options/common/col_sep.rdoc
#
# :include: ../doc/csv/options/common/quote_char.rdoc
#
# :include: ../doc/csv/options/parsing/field_size_limit.rdoc
#
# :include: ../doc/csv/options/parsing/converters.rdoc
#
# :include: ../doc/csv/options/parsing/unconverted_fields.rdoc
#
# :include: ../doc/csv/options/parsing/headers.rdoc
#
# :include: ../doc/csv/options/parsing/return_headers.rdoc
#
# :include: ../doc/csv/options/parsing/header_converters.rdoc
#
# :include: ../doc/csv/options/parsing/skip_blanks.rdoc
#
# :include: ../doc/csv/options/parsing/skip_lines.rdoc
#
# :include: ../doc/csv/options/parsing/strip.rdoc
#
# :include: ../doc/csv/options/parsing/liberal_parsing.rdoc
#
# :include: ../doc/csv/options/parsing/nil_value.rdoc
#
# :include: ../doc/csv/options/parsing/empty_value.rdoc
#
# ==== Options for Generating
#
# Options for generating, described in detail below, include:
# - +row_sep+: Specifies the row separator; used to delimit rows.
# - +col_sep+: Specifies the column separator; used to delimit fields.
# - +quote_char+: Specifies the quote character; used to quote fields.
# - +write_headers+: Specifies whether headers are to be written.
# - +force_quotes+: Specifies whether each output field is to be quoted.
# - +quote_empty+: Specifies whether each empty output field is to be quoted.
# - +write_converters+: Specifies the field converters to be used in writing.
# - +write_nil_value+: Specifies the object that is to be substituted for each +nil+-valued field.
# - +write_empty_value+: Specifies the object that is to be substituted for each empty field.
#
# :include: ../doc/csv/options/common/row_sep.rdoc
#
# :include: ../doc/csv/options/common/col_sep.rdoc
#
# :include: ../doc/csv/options/common/quote_char.rdoc
#
# :include: ../doc/csv/options/generating/write_headers.rdoc
#
# :include: ../doc/csv/options/generating/force_quotes.rdoc
#
# :include: ../doc/csv/options/generating/quote_empty.rdoc
#
# :include: ../doc/csv/options/generating/write_converters.rdoc
#
# :include: ../doc/csv/options/generating/write_nil_value.rdoc
#
# :include: ../doc/csv/options/generating/write_empty_value.rdoc
#
# === \CSV with Headers
#
# CSV allows to specify column names of CSV file, whether they are in data, or
# provided separately. If headers are specified, reading methods return an instance
# of CSV::Table, consisting of CSV::Row.
#
# # Headers are part of data
# data = CSV.parse(<<~ROWS, headers: true)
# Name,Department,Salary
# Bob,Engineering,1000
# Jane,Sales,2000
# John,Management,5000
# ROWS
#
# data.class #=> CSV::Table
# data.first #=> #
# data.first.to_h #=> {"Name"=>"Bob", "Department"=>"Engineering", "Salary"=>"1000"}
#
# # Headers provided by developer
# data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary])
# data.first #=> #
#
# === \Converters
#
# By default, each value (field or header) parsed by \CSV is formed into a \String.
# You can use a _field_ _converter_ or _header_ _converter_
# to intercept and modify the parsed values:
# - See {Field Converters}[#class-CSV-label-Field+Converters].
# - See {Header Converters}[#class-CSV-label-Header+Converters].
#
# Also by default, each value to be written during generation is written 'as-is'.
# You can use a _write_ _converter_ to modify values before writing.
# - See {Write Converters}[#class-CSV-label-Write+Converters].
#
# ==== Specifying \Converters
#
# You can specify converters for parsing or generating in the +options+
# argument to various \CSV methods:
# - Option +converters+ for converting parsed field values.
# - Option +header_converters+ for converting parsed header values.
# - Option +write_converters+ for converting values to be written (generated).
#
# There are three forms for specifying converters:
# - A converter proc: executable code to be used for conversion.
# - A converter name: the name of a stored converter.
# - A converter list: an array of converter procs, converter names, and converter lists.
#
# ===== Converter Procs
#
# This converter proc, +strip_converter+, accepts a value +field+
# and returns field.strip:
# strip_converter = proc {|field| field.strip }
# In this call to CSV.parse,
# the keyword argument converters: string_converter
# specifies that:
# - \Proc +string_converter+ is to be called for each parsed field.
# - The converter's return value is to replace the +field+ value.
# Example:
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: strip_converter)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# A converter proc can receive a second argument, +field_info+,
# that contains details about the field.
# This modified +strip_converter+ displays its arguments:
# strip_converter = proc do |field, field_info|
# p [field, field_info]
# field.strip
# end
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: strip_converter)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
# Output:
# [" foo ", #]
# [" 0 ", #]
# [" bar ", #]
# [" 1 ", #]
# [" baz ", #]
# [" 2 ", #]
# Each CSV::FieldInfo object shows:
# - The 0-based field index.
# - The 1-based line index.
# - The field header, if any.
#
# ===== Stored \Converters
#
# A converter may be given a name and stored in a structure where
# the parsing methods can find it by name.
#
# The storage structure for field converters is the \Hash CSV::Converters.
# It has several built-in converter procs:
# - :integer: converts each \String-embedded integer into a true \Integer.
# - :float: converts each \String-embedded float into a true \Float.
# - :date: converts each \String-embedded date into a true \Date.
# - :date_time: converts each \String-embedded date-time into a true \DateTime
# - :time: converts each \String-embedded time into a true \Time
# .
# This example creates a converter proc, then stores it:
# strip_converter = proc {|field| field.strip }
# CSV::Converters[:strip] = strip_converter
# Then the parsing method call can refer to the converter
# by its name, :strip:
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: :strip)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# The storage structure for header converters is the \Hash CSV::HeaderConverters,
# which works in the same way.
# It also has built-in converter procs:
# - :downcase: Downcases each header.
# - :symbol: Converts each header to a \Symbol.
#
# There is no such storage structure for write headers.
#
# In order for the parsing methods to access stored converters in non-main-Ractors, the
# storage structure must be made shareable first.
# Therefore, Ractor.make_shareable(CSV::Converters) and
# Ractor.make_shareable(CSV::HeaderConverters) must be called before the creation
# of Ractors that use the converters stored in these structures. (Since making the storage
# structures shareable involves freezing them, any custom converters that are to be used
# must be added first.)
#
# ===== Converter Lists
#
# A _converter_ _list_ is an \Array that may include any assortment of:
# - Converter procs.
# - Names of stored converters.
# - Nested converter lists.
#
# Examples:
# numeric_converters = [:integer, :float]
# date_converters = [:date, :date_time]
# [numeric_converters, strip_converter]
# [strip_converter, date_converters, :float]
#
# Like a converter proc, a converter list may be named and stored in either
# \CSV::Converters or CSV::HeaderConverters:
# CSV::Converters[:custom] = [strip_converter, date_converters, :float]
# CSV::HeaderConverters[:custom] = [:downcase, :symbol]
#
# There are two built-in converter lists:
# CSV::Converters[:numeric] # => [:integer, :float]
# CSV::Converters[:all] # => [:date_time, :numeric]
#
# ==== Field \Converters
#
# With no conversion, all parsed fields in all rows become Strings:
# string = "foo,0\nbar,1\nbaz,2\n"
# ary = CSV.parse(string)
# ary # => # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# When you specify a field converter, each parsed field is passed to the converter;
# its return value becomes the stored value for the field.
# A converter might, for example, convert an integer embedded in a \String
# into a true \Integer.
# (In fact, that's what built-in field converter +:integer+ does.)
#
# There are three ways to use field \converters.
#
# - Using option {converters}[#class-CSV-label-Option+converters] with a parsing method:
# ary = CSV.parse(string, converters: :integer)
# ary # => [0, 1, 2] # => [["foo", 0], ["bar", 1], ["baz", 2]]
# - Using option {converters}[#class-CSV-label-Option+converters] with a new \CSV instance:
# csv = CSV.new(string, converters: :integer)
# # Field converters in effect:
# csv.converters # => [:integer]
# csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
# - Using method #convert to add a field converter to a \CSV instance:
# csv = CSV.new(string)
# # Add a converter.
# csv.convert(:integer)
# csv.converters # => [:integer]
# csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
#
# Installing a field converter does not affect already-read rows:
# csv = CSV.new(string)
# csv.shift # => ["foo", "0"]
# # Add a converter.
# csv.convert(:integer)
# csv.converters # => [:integer]
# csv.read # => [["bar", 1], ["baz", 2]]
#
# There are additional built-in \converters, and custom \converters are also supported.
#
# ===== Built-In Field \Converters
#
# The built-in field converters are in \Hash CSV::Converters:
# - Each key is a field converter name.
# - Each value is one of:
# - A \Proc field converter.
# - An \Array of field converter names.
#
# Display:
# CSV::Converters.each_pair do |name, value|
# if value.kind_of?(Proc)
# p [name, value.class]
# else
# p [name, value]
# end
# end
# Output:
# [:integer, Proc]
# [:float, Proc]
# [:numeric, [:integer, :float]]
# [:date, Proc]
# [:date_time, Proc]
# [:time, Proc]
# [:all, [:date_time, :numeric]]
#
# Each of these converters transcodes values to UTF-8 before attempting conversion.
# If a value cannot be transcoded to UTF-8 the conversion will
# fail and the value will remain unconverted.
#
# Converter +:integer+ converts each field that Integer() accepts:
# data = '0,1,2,x'
# # Without the converter
# csv = CSV.parse_line(data)
# csv # => ["0", "1", "2", "x"]
# # With the converter
# csv = CSV.parse_line(data, converters: :integer)
# csv # => [0, 1, 2, "x"]
#
# Converter +:float+ converts each field that Float() accepts:
# data = '1.0,3.14159,x'
# # Without the converter
# csv = CSV.parse_line(data)
# csv # => ["1.0", "3.14159", "x"]
# # With the converter
# csv = CSV.parse_line(data, converters: :float)
# csv # => [1.0, 3.14159, "x"]
#
# Converter +:numeric+ converts with both +:integer+ and +:float+..
#
# Converter +:date+ converts each field that Date::parse accepts:
# data = '2001-02-03,x'
# # Without the converter
# csv = CSV.parse_line(data)
# csv # => ["2001-02-03", "x"]
# # With the converter
# csv = CSV.parse_line(data, converters: :date)
# csv # => [#, "x"]
#
# Converter +:date_time+ converts each field that DateTime::parse accepts:
# data = '2020-05-07T14:59:00-05:00,x'
# # Without the converter
# csv = CSV.parse_line(data)
# csv # => ["2020-05-07T14:59:00-05:00", "x"]
# # With the converter
# csv = CSV.parse_line(data, converters: :date_time)
# csv # => [#, "x"]
#
# Converter +time+ converts each field that Time::parse accepts:
# data = '2020-05-07T14:59:00-05:00,x'
# # Without the converter
# csv = CSV.parse_line(data)
# csv # => ["2020-05-07T14:59:00-05:00", "x"]
# # With the converter
# csv = CSV.parse_line(data, converters: :time)
# csv # => [2020-05-07 14:59:00 -0500, "x"]
#
# Converter +:numeric+ converts with both +:date_time+ and +:numeric+..
#
# As seen above, method #convert adds \converters to a \CSV instance,
# and method #converters returns an \Array of the \converters in effect:
# csv = CSV.new('0,1,2')
# csv.converters # => []
# csv.convert(:integer)
# csv.converters # => [:integer]
# csv.convert(:date)
# csv.converters # => [:integer, :date]
#
# ===== Custom Field \Converters
#
# You can define a custom field converter:
# strip_converter = proc {|field| field.strip }
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: strip_converter)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
# You can register the converter in \Converters \Hash,
# which allows you to refer to it by name:
# CSV::Converters[:strip] = strip_converter
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: :strip)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# ==== Header \Converters
#
# Header converters operate only on headers (and not on other rows).
#
# There are three ways to use header \converters;
# these examples use built-in header converter +:downcase+,
# which downcases each parsed header.
#
# - Option +header_converters+ with a singleton parsing method:
# string = "Name,Count\nFoo,0\n,Bar,1\nBaz,2"
# tbl = CSV.parse(string, headers: true, header_converters: :downcase)
# tbl.class # => CSV::Table
# tbl.headers # => ["name", "count"]
#
# - Option +header_converters+ with a new \CSV instance:
# csv = CSV.new(string, header_converters: :downcase)
# # Header converters in effect:
# csv.header_converters # => [:downcase]
# tbl = CSV.parse(string, headers: true)
# tbl.headers # => ["Name", "Count"]
#
# - Method #header_convert adds a header converter to a \CSV instance:
# csv = CSV.new(string)
# # Add a header converter.
# csv.header_convert(:downcase)
# csv.header_converters # => [:downcase]
# tbl = CSV.parse(string, headers: true)
# tbl.headers # => ["Name", "Count"]
#
# ===== Built-In Header \Converters
#
# The built-in header \converters are in \Hash CSV::HeaderConverters.
# The keys there are the names of the \converters:
# CSV::HeaderConverters.keys # => [:downcase, :symbol]
#
# Converter +:downcase+ converts each header by downcasing it:
# string = "Name,Count\nFoo,0\n,Bar,1\nBaz,2"
# tbl = CSV.parse(string, headers: true, header_converters: :downcase)
# tbl.class # => CSV::Table
# tbl.headers # => ["name", "count"]
#
# Converter +:symbol+ converts each header by making it into a \Symbol:
# string = "Name,Count\nFoo,0\n,Bar,1\nBaz,2"
# tbl = CSV.parse(string, headers: true, header_converters: :symbol)
# tbl.headers # => [:name, :count]
# Details:
# - Strips leading and trailing whitespace.
# - Downcases the header.
# - Replaces embedded spaces with underscores.
# - Removes non-word characters.
# - Makes the string into a \Symbol.
#
# ===== Custom Header \Converters
#
# You can define a custom header converter:
# upcase_converter = proc {|header| header.upcase }
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(string, headers: true, header_converters: upcase_converter)
# table # => #
# table.headers # => ["NAME", "VALUE"]
# You can register the converter in \HeaderConverters \Hash,
# which allows you to refer to it by name:
# CSV::HeaderConverters[:upcase] = upcase_converter
# table = CSV.parse(string, headers: true, header_converters: :upcase)
# table # => #
# table.headers # => ["NAME", "VALUE"]
#
# ===== Write \Converters
#
# When you specify a write converter for generating \CSV,
# each field to be written is passed to the converter;
# its return value becomes the new value for the field.
# A converter might, for example, strip whitespace from a field.
#
# Using no write converter (all fields unmodified):
# output_string = CSV.generate do |csv|
# csv << [' foo ', 0]
# csv << [' bar ', 1]
# csv << [' baz ', 2]
# end
# output_string # => " foo ,0\n bar ,1\n baz ,2\n"
# Using option +write_converters+ with two custom write converters:
# strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
# upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field }
# write_converters = [strip_converter, upcase_converter]
# output_string = CSV.generate(write_converters: write_converters) do |csv|
# csv << [' foo ', 0]
# csv << [' bar ', 1]
# csv << [' baz ', 2]
# end
# output_string # => "FOO,0\nBAR,1\nBAZ,2\n"
#
# === Character Encodings (M17n or Multilingualization)
#
# This new CSV parser is m17n savvy. The parser works in the Encoding of the IO
# or String object being read from or written to. Your data is never transcoded
# (unless you ask Ruby to transcode it for you) and will literally be parsed in
# the Encoding it is in. Thus CSV will return Arrays or Rows of Strings in the
# Encoding of your data. This is accomplished by transcoding the parser itself
# into your Encoding.
#
# Some transcoding must take place, of course, to accomplish this multiencoding
# support. For example, :col_sep, :row_sep, and
# :quote_char must be transcoded to match your data. Hopefully this
# makes the entire process feel transparent, since CSV's defaults should just
# magically work for your data. However, you can set these values manually in
# the target Encoding to avoid the translation.
#
# It's also important to note that while all of CSV's core parser is now
# Encoding agnostic, some features are not. For example, the built-in
# converters will try to transcode data to UTF-8 before making conversions.
# Again, you can provide custom converters that are aware of your Encodings to
# avoid this translation. It's just too hard for me to support native
# conversions in all of Ruby's Encodings.
#
# Anyway, the practical side of this is simple: make sure IO and String objects
# passed into CSV have the proper Encoding set and everything should just work.
# CSV methods that allow you to open IO objects (CSV::foreach(), CSV::open(),
# CSV::read(), and CSV::readlines()) do allow you to specify the Encoding.
#
# One minor exception comes when generating CSV into a String with an Encoding
# that is not ASCII compatible. There's no existing data for CSV to use to
# prepare itself and thus you will probably need to manually specify the desired
# Encoding for most of those cases. It will try to guess using the fields in a
# row of output though, when using CSV::generate_line() or Array#to_csv().
#
# I try to point out any other Encoding issues in the documentation of methods
# as they come up.
#
# This has been tested to the best of my ability with all non-"dummy" Encodings
# Ruby ships with. However, it is brave new code and may have some bugs.
# Please feel free to {report}[mailto:james@grayproductions.net] any issues you
# find with it.
#
class CSV
# The error thrown when the parser encounters illegal CSV formatting.
class MalformedCSVError < RuntimeError
attr_reader :line_number
alias_method :lineno, :line_number
def initialize(message, line_number)
@line_number = line_number
super("#{message} in line #{line_number}.")
end
end
# The error thrown when the parser encounters invalid encoding in CSV.
class InvalidEncodingError < MalformedCSVError
attr_reader :encoding
def initialize(encoding, line_number)
@encoding = encoding
super("Invalid byte sequence in #{encoding}", line_number)
end
end
#
# A FieldInfo Struct contains details about a field's position in the data
# source it was read from. CSV will pass this Struct to some blocks that make
# decisions based on field structure. See CSV.convert_fields() for an
# example.
#
# index:: The zero-based index of the field in its row.
# line:: The line of the data source this row is from.
# header:: The header for the column, when available.
# quoted?:: True or false, whether the original value is quoted or not.
#
FieldInfo = Struct.new(:index, :line, :header, :quoted?)
# A Regexp used to find and convert some common Date formats.
DateMatcher = / \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
\d{4}-\d{2}-\d{2} )\z /x
# A Regexp used to find and convert some common (Date)Time formats.
DateTimeMatcher =
/ \A(?: (\w+,?\s+)?\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2},?\s+\d{2,4} |
# ISO-8601 and RFC-3339 (space instead of T) recognized by (Date)Time.parse
\d{4}-\d{2}-\d{2}
(?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
)\z /x
# The encoding used by all converters.
ConverterEncoding = Encoding.find("UTF-8")
# A \Hash containing the names and \Procs for the built-in field converters.
# See {Built-In Field Converters}[#class-CSV-label-Built-In+Field+Converters].
#
# This \Hash is intentionally left unfrozen, and may be extended with
# custom field converters.
# See {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters].
Converters = {
integer: lambda { |f|
Integer(f.encode(ConverterEncoding)) rescue f
},
float: lambda { |f|
Float(f.encode(ConverterEncoding)) rescue f
},
numeric: [:integer, :float],
date: lambda { |f|
begin
e = f.encode(ConverterEncoding)
e.match?(DateMatcher) ? Date.parse(e) : f
rescue # encoding conversion or date parse errors
f
end
},
date_time: lambda { |f|
begin
e = f.encode(ConverterEncoding)
e.match?(DateTimeMatcher) ? DateTime.parse(e) : f
rescue # encoding conversion or date parse errors
f
end
},
time: lambda { |f|
begin
e = f.encode(ConverterEncoding)
e.match?(DateTimeMatcher) ? Time.parse(e) : f
rescue # encoding conversion or parse errors
f
end
},
all: [:date_time, :numeric],
}
# A \Hash containing the names and \Procs for the built-in header converters.
# See {Built-In Header Converters}[#class-CSV-label-Built-In+Header+Converters].
#
# This \Hash is intentionally left unfrozen, and may be extended with
# custom field converters.
# See {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters].
HeaderConverters = {
downcase: lambda { |h| h.encode(ConverterEncoding).downcase },
symbol: lambda { |h|
h.encode(ConverterEncoding).downcase.gsub(/[^\s\w]+/, "").strip.
gsub(/\s+/, "_").to_sym
},
symbol_raw: lambda { |h| h.encode(ConverterEncoding).to_sym }
}
# Default values for method options.
DEFAULT_OPTIONS = {
# For both parsing and generating.
col_sep: ",",
row_sep: :auto,
quote_char: '"',
# For parsing.
field_size_limit: nil,
max_field_size: nil,
converters: nil,
unconverted_fields: nil,
headers: false,
return_headers: false,
header_converters: nil,
skip_blanks: false,
skip_lines: nil,
liberal_parsing: false,
nil_value: nil,
empty_value: "",
strip: false,
# For generating.
write_headers: nil,
quote_empty: true,
force_quotes: false,
write_converters: nil,
write_nil_value: nil,
write_empty_value: "",
}.freeze
class << self
# :call-seq:
# instance(string, **options)
# instance(io = $stdout, **options)
# instance(string, **options) {|csv| ... }
# instance(io = $stdout, **options) {|csv| ... }
#
# Creates or retrieves cached \CSV objects.
# For arguments and options, see CSV.new.
#
# This API is not Ractor-safe.
#
# ---
#
# With no block given, returns a \CSV object.
#
# The first call to +instance+ creates and caches a \CSV object:
# s0 = 's0'
# csv0 = CSV.instance(s0)
# csv0.class # => CSV
#
# Subsequent calls to +instance+ with that _same_ +string+ or +io+
# retrieve that same cached object:
# csv1 = CSV.instance(s0)
# csv1.class # => CSV
# csv1.equal?(csv0) # => true # Same CSV object
#
# A subsequent call to +instance+ with a _different_ +string+ or +io+
# creates and caches a _different_ \CSV object.
# s1 = 's1'
# csv2 = CSV.instance(s1)
# csv2.equal?(csv0) # => false # Different CSV object
#
# All the cached objects remains available:
# csv3 = CSV.instance(s0)
# csv3.equal?(csv0) # true # Same CSV object
# csv4 = CSV.instance(s1)
# csv4.equal?(csv2) # true # Same CSV object
#
# ---
#
# When a block is given, calls the block with the created or retrieved
# \CSV object; returns the block's return value:
# CSV.instance(s0) {|csv| :foo } # => :foo
def instance(data = $stdout, **options)
# create a _signature_ for this method call, data object and options
sig = [data.object_id] +
options.values_at(*DEFAULT_OPTIONS.keys)
# fetch or create the instance for this signature
@@instances ||= Hash.new
instance = (@@instances[sig] ||= new(data, **options))
if block_given?
yield instance # run block, if given, returning result
else
instance # or return the instance
end
end
# :call-seq:
# filter(in_string_or_io, **options) {|row| ... } -> array_of_arrays or csv_table
# filter(in_string_or_io, out_string_or_io, **options) {|row| ... } -> array_of_arrays or csv_table
# filter(**options) {|row| ... } -> array_of_arrays or csv_table
#
# - Parses \CSV from a source (\String, \IO stream, or ARGF).
# - Calls the given block with each parsed row:
# - Without headers, each row is an \Array.
# - With headers, each row is a CSV::Row.
# - Generates \CSV to an output (\String, \IO stream, or STDOUT).
# - Returns the parsed source:
# - Without headers, an \Array of \Arrays.
# - With headers, a CSV::Table.
#
# When +in_string_or_io+ is given, but not +out_string_or_io+,
# parses from the given +in_string_or_io+
# and generates to STDOUT.
#
# \String input without headers:
#
# in_string = "foo,0\nbar,1\nbaz,2"
# CSV.filter(in_string) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]]
#
# Output (to STDOUT):
#
# FOO,0
# BAR,-1
# BAZ,-2
#
# \String input with headers:
#
# in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2"
# CSV.filter(in_string, headers: true) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end # => #
#
# Output (to STDOUT):
#
# Name,Value
# FOO,0
# BAR,-1
# BAZ,-2
#
# \IO stream input without headers:
#
# File.write('t.csv', "foo,0\nbar,1\nbaz,2")
# File.open('t.csv') do |in_io|
# CSV.filter(in_io) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end
# end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]]
#
# Output (to STDOUT):
#
# FOO,0
# BAR,-1
# BAZ,-2
#
# \IO stream input with headers:
#
# File.write('t.csv', "Name,Value\nfoo,0\nbar,1\nbaz,2")
# File.open('t.csv') do |in_io|
# CSV.filter(in_io, headers: true) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end
# end # => #
#
# Output (to STDOUT):
#
# Name,Value
# FOO,0
# BAR,-1
# BAZ,-2
#
# When both +in_string_or_io+ and +out_string_or_io+ are given,
# parses from +in_string_or_io+ and generates to +out_string_or_io+.
#
# \String output without headers:
#
# in_string = "foo,0\nbar,1\nbaz,2"
# out_string = ''
# CSV.filter(in_string, out_string) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]]
# out_string # => "FOO,0\nBAR,-1\nBAZ,-2\n"
#
# \String output with headers:
#
# in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2"
# out_string = ''
# CSV.filter(in_string, out_string, headers: true) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end # => #
# out_string # => "Name,Value\nFOO,0\nBAR,-1\nBAZ,-2\n"
#
# \IO stream output without headers:
#
# in_string = "foo,0\nbar,1\nbaz,2"
# File.open('t.csv', 'w') do |out_io|
# CSV.filter(in_string, out_io) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end
# end # => [["FOO", 0], ["BAR", -1], ["BAZ", -2]]
# File.read('t.csv') # => "FOO,0\nBAR,-1\nBAZ,-2\n"
#
# \IO stream output with headers:
#
# in_string = "Name,Value\nfoo,0\nbar,1\nbaz,2"
# File.open('t.csv', 'w') do |out_io|
# CSV.filter(in_string, out_io, headers: true) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end
# end # => #
# File.read('t.csv') # => "Name,Value\nFOO,0\nBAR,-1\nBAZ,-2\n"
#
# When neither +in_string_or_io+ nor +out_string_or_io+ given,
# parses from {ARGF}[rdoc-ref:ARGF]
# and generates to STDOUT.
#
# Without headers:
#
# # Put Ruby code into a file.
# ruby = <<-EOT
# require 'csv'
# CSV.filter do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end
# EOT
# File.write('t.rb', ruby)
# # Put some CSV into a file.
# File.write('t.csv', "foo,0\nbar,1\nbaz,2")
# # Run the Ruby code with CSV filename as argument.
# system(Gem.ruby, "t.rb", "t.csv")
#
# Output (to STDOUT):
#
# FOO,0
# BAR,-1
# BAZ,-2
#
# With headers:
#
# # Put Ruby code into a file.
# ruby = <<-EOT
# require 'csv'
# CSV.filter(headers: true) do |row|
# row[0].upcase!
# row[1] = - row[1].to_i
# end
# EOT
# File.write('t.rb', ruby)
# # Put some CSV into a file.
# File.write('t.csv', "Name,Value\nfoo,0\nbar,1\nbaz,2")
# # Run the Ruby code with CSV filename as argument.
# system(Gem.ruby, "t.rb", "t.csv")
#
# Output (to STDOUT):
#
# Name,Value
# FOO,0
# BAR,-1
# BAZ,-2
#
# Arguments:
#
# * Argument +in_string_or_io+ must be a \String or an \IO stream.
# * Argument +out_string_or_io+ must be a \String or an \IO stream.
# * Arguments **options must be keyword options.
#
# - Each option defined as an {option for parsing}[#class-CSV-label-Options+for+Parsing]
# is used for parsing the filter input.
# - Each option defined as an {option for generating}[#class-CSV-label-Options+for+Generating]
# is used for generator the filter input.
#
# However, there are three options that may be used for both parsing and generating:
# +col_sep+, +quote_char+, and +row_sep+.
#
# Therefore for method +filter+ (and method +filter+ only),
# there are special options that allow these parsing and generating options
# to be specified separately:
#
# - Options +input_col_sep+ and +output_col_sep+
# (and their aliases +in_col_sep+ and +out_col_sep+)
# specify the column separators for parsing and generating.
# - Options +input_quote_char+ and +output_quote_char+
# (and their aliases +in_quote_char+ and +out_quote_char+)
# specify the quote characters for parsing and generting.
# - Options +input_row_sep+ and +output_row_sep+
# (and their aliases +in_row_sep+ and +out_row_sep+)
# specify the row separators for parsing and generating.
#
# Example options (for column separators):
#
# CSV.filter # Default for both parsing and generating.
# CSV.filter(in_col_sep: ';') # ';' for parsing, default for generating.
# CSV.filter(out_col_sep: '|') # Default for parsing, '|' for generating.
# CSV.filter(in_col_sep: ';', out_col_sep: '|') # ';' for parsing, '|' for generating.
#
# Note that for a special option (e.g., +input_col_sep+)
# and its corresponding "regular" option (e.g., +col_sep+),
# the two are mutually overriding.
#
# Another example (possibly surprising):
#
# CSV.filter(in_col_sep: ';', col_sep: '|') # '|' for both parsing(!) and generating.
#
def filter(input=nil, output=nil, **options)
# parse options for input, output, or both
in_options, out_options = Hash.new, {row_sep: InputRecordSeparator.value}
options.each do |key, value|
case key
when /\Ain(?:put)?_(.+)\Z/
in_options[$1.to_sym] = value
when /\Aout(?:put)?_(.+)\Z/
out_options[$1.to_sym] = value
else
in_options[key] = value
out_options[key] = value
end
end
# build input and output wrappers
input = new(input || ARGF, **in_options)
output = new(output || $stdout, **out_options)
# process headers
need_manual_header_output =
(in_options[:headers] and
out_options[:headers] == true and
out_options[:write_headers])
if need_manual_header_output
first_row = input.shift
if first_row
if first_row.is_a?(Row)
headers = first_row.headers
yield headers
output << headers
end
yield first_row
output << first_row
end
end
# read, yield, write
input.each do |row|
yield row
output << row
end
end
#
# :call-seq:
# foreach(path_or_io, mode='r', **options) {|row| ... )
# foreach(path_or_io, mode='r', **options) -> new_enumerator
#
# Calls the block with each row read from source +path_or_io+.
#
# \Path input without headers:
#
# string = "foo,0\nbar,1\nbaz,2\n"
# in_path = 't.csv'
# File.write(in_path, string)
# CSV.foreach(in_path) {|row| p row }
#
# Output:
#
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# \Path input with headers:
#
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# in_path = 't.csv'
# File.write(in_path, string)
# CSV.foreach(in_path, headers: true) {|row| p row }
#
# Output:
#
#
#
#
#
# \IO stream input without headers:
#
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# File.open('t.csv') do |in_io|
# CSV.foreach(in_io) {|row| p row }
# end
#
# Output:
#
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# \IO stream input with headers:
#
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# File.open('t.csv') do |in_io|
# CSV.foreach(in_io, headers: true) {|row| p row }
# end
#
# Output:
#
#
#
#
#
# With no block given, returns an \Enumerator:
#
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# CSV.foreach(path) # => #
#
# Arguments:
# * Argument +path_or_io+ must be a file path or an \IO stream.
# * Argument +mode+, if given, must be a \File mode.
# See {Access Modes}[https://docs.ruby-lang.org/en/master/File.html#class-File-label-Access+Modes].
# * Arguments **options must be keyword options.
# See {Options for Parsing}[#class-CSV-label-Options+for+Parsing].
# * This method optionally accepts an additional :encoding option
# that you can use to specify the Encoding of the data read from +path+ or +io+.
# You must provide this unless your data is in the encoding
# given by Encoding::default_external.
# Parsing will use this to determine how to parse the data.
# You may provide a second Encoding to
# have the data transcoded as it is read. For example,
# encoding: 'UTF-32BE:UTF-8'
# would read +UTF-32BE+ data from the file
# but transcode it to +UTF-8+ before parsing.
def foreach(path, mode="r", **options, &block)
return to_enum(__method__, path, mode, **options) unless block_given?
open(path, mode, **options) do |csv|
csv.each(&block)
end
end
#
# :call-seq:
# generate(csv_string, **options) {|csv| ... }
# generate(**options) {|csv| ... }
#
# * Argument +csv_string+, if given, must be a \String object;
# defaults to a new empty \String.
# * Arguments +options+, if given, should be generating options.
# See {Options for Generating}[#class-CSV-label-Options+for+Generating].
#
# ---
#
# Creates a new \CSV object via CSV.new(csv_string, **options);
# calls the block with the \CSV object, which the block may modify;
# returns the \String generated from the \CSV object.
#
# Note that a passed \String *is* modified by this method.
# Pass csv_string.dup if the \String must be preserved.
#
# This method has one additional option: :encoding,
# which sets the base Encoding for the output if no no +str+ is specified.
# CSV needs this hint if you plan to output non-ASCII compatible data.
#
# ---
#
# Add lines:
# input_string = "foo,0\nbar,1\nbaz,2\n"
# output_string = CSV.generate(input_string) do |csv|
# csv << ['bat', 3]
# csv << ['bam', 4]
# end
# output_string # => "foo,0\nbar,1\nbaz,2\nbat,3\nbam,4\n"
# input_string # => "foo,0\nbar,1\nbaz,2\nbat,3\nbam,4\n"
# output_string.equal?(input_string) # => true # Same string, modified
#
# Add lines into new string, preserving old string:
# input_string = "foo,0\nbar,1\nbaz,2\n"
# output_string = CSV.generate(input_string.dup) do |csv|
# csv << ['bat', 3]
# csv << ['bam', 4]
# end
# output_string # => "foo,0\nbar,1\nbaz,2\nbat,3\nbam,4\n"
# input_string # => "foo,0\nbar,1\nbaz,2\n"
# output_string.equal?(input_string) # => false # Different strings
#
# Create lines from nothing:
# output_string = CSV.generate do |csv|
# csv << ['foo', 0]
# csv << ['bar', 1]
# csv << ['baz', 2]
# end
# output_string # => "foo,0\nbar,1\nbaz,2\n"
#
# ---
#
# Raises an exception if +csv_string+ is not a \String object:
# # Raises TypeError (no implicit conversion of Integer into String)
# CSV.generate(0)
#
def generate(str=nil, **options)
encoding = options[:encoding]
# add a default empty String, if none was given
if str
str = StringIO.new(str)
str.seek(0, IO::SEEK_END)
str.set_encoding(encoding) if encoding
else
str = +""
str.force_encoding(encoding) if encoding
end
csv = new(str, **options) # wrap
yield csv # yield for appending
csv.string # return final String
end
# :call-seq:
# CSV.generate_line(ary)
# CSV.generate_line(ary, **options)
#
# Returns the \String created by generating \CSV from +ary+
# using the specified +options+.
#
# Argument +ary+ must be an \Array.
#
# Special options:
# * Option :row_sep defaults to "\n"> on Ruby 3.0 or later
# and $INPUT_RECORD_SEPARATOR ($/) otherwise.:
# $INPUT_RECORD_SEPARATOR # => "\n"
# * This method accepts an additional option, :encoding, which sets the base
# Encoding for the output. This method will try to guess your Encoding from
# the first non-+nil+ field in +row+, if possible, but you may need to use
# this parameter as a backup plan.
#
# For other +options+,
# see {Options for Generating}[#class-CSV-label-Options+for+Generating].
#
# ---
#
# Returns the \String generated from an \Array:
# CSV.generate_line(['foo', '0']) # => "foo,0\n"
#
# ---
#
# Raises an exception if +ary+ is not an \Array:
# # Raises NoMethodError (undefined method `find' for :foo:Symbol)
# CSV.generate_line(:foo)
#
def generate_line(row, **options)
options = {row_sep: InputRecordSeparator.value}.merge(options)
str = +""
if options[:encoding]
str.force_encoding(options[:encoding])
else
fallback_encoding = nil
output_encoding = nil
row.each do |field|
next unless field.is_a?(String)
fallback_encoding ||= field.encoding
next if field.ascii_only?
output_encoding = field.encoding
break
end
output_encoding ||= fallback_encoding
if output_encoding
str.force_encoding(output_encoding)
end
end
(new(str, **options) << row).string
end
# :call-seq:
# CSV.generate_lines(rows)
# CSV.generate_lines(rows, **options)
#
# Returns the \String created by generating \CSV from
# using the specified +options+.
#
# Argument +rows+ must be an \Array of row. Row is \Array of \String or \CSV::Row.
#
# Special options:
# * Option :row_sep defaults to "\n" on Ruby 3.0 or later
# and $INPUT_RECORD_SEPARATOR ($/) otherwise.:
# $INPUT_RECORD_SEPARATOR # => "\n"
# * This method accepts an additional option, :encoding, which sets the base
# Encoding for the output. This method will try to guess your Encoding from
# the first non-+nil+ field in +row+, if possible, but you may need to use
# this parameter as a backup plan.
#
# For other +options+,
# see {Options for Generating}[#class-CSV-label-Options+for+Generating].
#
# ---
#
# Returns the \String generated from an
# CSV.generate_lines([['foo', '0'], ['bar', '1'], ['baz', '2']]) # => "foo,0\nbar,1\nbaz,2\n"
#
# ---
#
# Raises an exception
# # Raises NoMethodError (undefined method `each' for :foo:Symbol)
# CSV.generate_lines(:foo)
#
def generate_lines(rows, **options)
self.generate(**options) do |csv|
rows.each do |row|
csv << row
end
end
end
#
# :call-seq:
# open(path_or_io, mode = "rb", **options ) -> new_csv
# open(path_or_io, mode = "rb", **options ) { |csv| ... } -> object
#
# possible options elements:
# keyword form:
# :invalid => nil # raise error on invalid byte sequence (default)
# :invalid => :replace # replace invalid byte sequence
# :undef => :replace # replace undefined conversion
# :replace => string # replacement string ("?" or "\uFFFD" if not specified)
#
# * Argument +path_or_io+, must be a file path or an \IO stream.
# :include: ../doc/csv/arguments/io.rdoc
# * Argument +mode+, if given, must be a \File mode.
# See {Access Modes}[https://docs.ruby-lang.org/en/master/File.html#class-File-label-Access+Modes].
# * Arguments **options must be keyword options.
# See {Options for Generating}[#class-CSV-label-Options+for+Generating].
# * This method optionally accepts an additional :encoding option
# that you can use to specify the Encoding of the data read from +path+ or +io+.
# You must provide this unless your data is in the encoding
# given by Encoding::default_external.
# Parsing will use this to determine how to parse the data.
# You may provide a second Encoding to
# have the data transcoded as it is read. For example,
# encoding: 'UTF-32BE:UTF-8'
# would read +UTF-32BE+ data from the file
# but transcode it to +UTF-8+ before parsing.
#
# ---
#
# These examples assume prior execution of:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# string_io = StringIO.new
# string_io << "foo,0\nbar,1\nbaz,2\n"
#
# ---
#
# With no block given, returns a new \CSV object.
#
# Create a \CSV object using a file path:
# csv = CSV.open(path)
# csv # => #
#
# Create a \CSV object using an open \File:
# csv = CSV.open(File.open(path))
# csv # => #
#
# Create a \CSV object using a \StringIO:
# csv = CSV.open(string_io)
# csv # => #
# ---
#
# With a block given, calls the block with the created \CSV object;
# returns the block's return value:
#
# Using a file path:
# csv = CSV.open(path) {|csv| p csv}
# csv # => #
# Output:
# #
#
# Using an open \File:
# csv = CSV.open(File.open(path)) {|csv| p csv}
# csv # => #
# Output:
# #
#
# Using a \StringIO:
# csv = CSV.open(string_io) {|csv| p csv}
# csv # => #
# Output:
# #
# ---
#
# Raises an exception if the argument is not a \String object or \IO object:
# # Raises TypeError (no implicit conversion of Symbol into String)
# CSV.open(:foo)
def open(filename_or_io, mode="r", **options)
# wrap a File opened with the remaining +args+ with no newline
# decorator
file_opts = {}
may_enable_bom_detection_automatically(filename_or_io,
mode,
options,
file_opts)
file_opts.merge!(options)
unless file_opts.key?(:newline)
file_opts[:universal_newline] ||= false
end
options.delete(:invalid)
options.delete(:undef)
options.delete(:replace)
options.delete_if {|k, _| /newline\z/.match?(k)}
if filename_or_io.is_a?(StringIO)
f = create_stringio(filename_or_io.string, mode, **file_opts)
else
begin
f = File.open(filename_or_io, mode, **file_opts)
rescue ArgumentError => e
raise unless /needs binmode/.match?(e.message) and mode == "r"
mode = "rb"
file_opts = {encoding: Encoding.default_external}.merge(file_opts)
retry
end
end
begin
csv = new(f, **options)
rescue Exception
f.close
raise
end
# handle blocks like Ruby's open(), not like the CSV library
if block_given?
begin
yield csv
ensure
csv.close
end
else
csv
end
end
#
# :call-seq:
# parse(string) -> array_of_arrays
# parse(io) -> array_of_arrays
# parse(string, headers: ..., **options) -> csv_table
# parse(io, headers: ..., **options) -> csv_table
# parse(string, **options) {|row| ... }
# parse(io, **options) {|row| ... }
#
# Parses +string+ or +io+ using the specified +options+.
#
# - Argument +string+ should be a \String object;
# it will be put into a new StringIO object positioned at the beginning.
# :include: ../doc/csv/arguments/io.rdoc
# - Argument +options+: see {Options for Parsing}[#class-CSV-label-Options+for+Parsing]
#
# ====== Without Option +headers+
#
# Without {option +headers+}[#class-CSV-label-Option+headers] case.
#
# These examples assume prior execution of:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# ---
#
# With no block given, returns an \Array of Arrays formed from the source.
#
# Parse a \String:
# a_of_a = CSV.parse(string)
# a_of_a # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# Parse an open \File:
# a_of_a = File.open(path) do |file|
# CSV.parse(file)
# end
# a_of_a # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# ---
#
# With a block given, calls the block with each parsed row:
#
# Parse a \String:
# CSV.parse(string) {|row| p row }
#
# Output:
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# Parse an open \File:
# File.open(path) do |file|
# CSV.parse(file) {|row| p row }
# end
#
# Output:
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# ====== With Option +headers+
#
# With {option +headers+}[#class-CSV-label-Option+headers] case.
#
# These examples assume prior execution of:
# string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# ---
#
# With no block given, returns a CSV::Table object formed from the source.
#
# Parse a \String:
# csv_table = CSV.parse(string, headers: ['Name', 'Count'])
# csv_table # => #
#
# Parse an open \File:
# csv_table = File.open(path) do |file|
# CSV.parse(file, headers: ['Name', 'Count'])
# end
# csv_table # => #
#
# ---
#
# With a block given, calls the block with each parsed row,
# which has been formed into a CSV::Row object:
#
# Parse a \String:
# CSV.parse(string, headers: ['Name', 'Count']) {|row| p row }
#
# Output:
# #
# #
# #
#
# Parse an open \File:
# File.open(path) do |file|
# CSV.parse(file, headers: ['Name', 'Count']) {|row| p row }
# end
#
# Output:
# #
# #
# #
#
# ---
#
# Raises an exception if the argument is not a \String object or \IO object:
# # Raises NoMethodError (undefined method `close' for :foo:Symbol)
# CSV.parse(:foo)
#
# ---
#
# Please make sure if your text contains \BOM or not. CSV.parse will not remove
# \BOM automatically. You might want to remove \BOM before calling CSV.parse :
# # remove BOM on calling File.open
# File.open(path, encoding: 'bom|utf-8') do |file|
# CSV.parse(file, headers: true) do |row|
# # you can get value by column name because BOM is removed
# p row['Name']
# end
# end
#
# Output:
# # "foo"
# # "bar"
# # "baz"
def parse(str, **options, &block)
csv = new(str, **options)
return csv.each(&block) if block_given?
# slurp contents, if no block is given
begin
csv.read
ensure
csv.close
end
end
# :call-seq:
# CSV.parse_line(string) -> new_array or nil
# CSV.parse_line(io) -> new_array or nil
# CSV.parse_line(string, **options) -> new_array or nil
# CSV.parse_line(io, **options) -> new_array or nil
# CSV.parse_line(string, headers: true, **options) -> csv_row or nil
# CSV.parse_line(io, headers: true, **options) -> csv_row or nil
#
# Returns the data created by parsing the first line of +string+ or +io+
# using the specified +options+.
#
# - Argument +string+ should be a \String object;
# it will be put into a new StringIO object positioned at the beginning.
# :include: ../doc/csv/arguments/io.rdoc
# - Argument +options+: see {Options for Parsing}[#class-CSV-label-Options+for+Parsing]
#
# ====== Without Option +headers+
#
# Without option +headers+, returns the first row as a new \Array.
#
# These examples assume prior execution of:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# Parse the first line from a \String object:
# CSV.parse_line(string) # => ["foo", "0"]
#
# Parse the first line from a File object:
# File.open(path) do |file|
# CSV.parse_line(file) # => ["foo", "0"]
# end # => ["foo", "0"]
#
# Returns +nil+ if the argument is an empty \String:
# CSV.parse_line('') # => nil
#
# ====== With Option +headers+
#
# With {option +headers+}[#class-CSV-label-Option+headers],
# returns the first row as a CSV::Row object.
#
# These examples assume prior execution of:
# string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# Parse the first line from a \String object:
# CSV.parse_line(string, headers: true) # => #
#
# Parse the first line from a File object:
# File.open(path) do |file|
# CSV.parse_line(file, headers: true)
# end # => #
#
# ---
#
# Raises an exception if the argument is +nil+:
# # Raises ArgumentError (Cannot parse nil as CSV):
# CSV.parse_line(nil)
#
def parse_line(line, **options)
new(line, **options).each.first
end
#
# :call-seq:
# read(source, **options) -> array_of_arrays
# read(source, headers: true, **options) -> csv_table
#
# Opens the given +source+ with the given +options+ (see CSV.open),
# reads the source (see CSV#read), and returns the result,
# which will be either an \Array of Arrays or a CSV::Table.
#
# Without headers:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# CSV.read(path) # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# With headers:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# CSV.read(path, headers: true) # => #
def read(path, **options)
open(path, **options) { |csv| csv.read }
end
# :call-seq:
# CSV.readlines(source, **options)
#
# Alias for CSV.read.
def readlines(path, **options)
read(path, **options)
end
# :call-seq:
# CSV.table(source, **options)
#
# Calls CSV.read with +source+, +options+, and certain default options:
# - +headers+: +true+
# - +converters+: +:numeric+
# - +header_converters+: +:symbol+
#
# Returns a CSV::Table object.
#
# Example:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# CSV.table(path) # => #
def table(path, **options)
default_options = {
headers: true,
converters: :numeric,
header_converters: :symbol,
}
options = default_options.merge(options)
read(path, **options)
end
ON_WINDOWS = /mingw|mswin/.match?(RUBY_PLATFORM)
private_constant :ON_WINDOWS
private
def may_enable_bom_detection_automatically(filename_or_io,
mode,
options,
file_opts)
if filename_or_io.is_a?(StringIO)
# Support to StringIO was dropped for Ruby 2.6 and earlier without BOM support:
# https://github.com/ruby/stringio/pull/47
return if RUBY_VERSION < "2.7"
else
# "bom|utf-8" may be buggy on Windows:
# https://bugs.ruby-lang.org/issues/20526
return if ON_WINDOWS
end
return unless Encoding.default_external == Encoding::UTF_8
return if options.key?(:encoding)
return if options.key?(:external_encoding)
return if mode.is_a?(String) and mode.include?(":")
file_opts[:encoding] = "bom|utf-8"
end
if RUBY_VERSION < "2.7"
def create_stringio(str, mode, opts)
opts.delete_if {|k, _| k == :universal_newline or DEFAULT_OPTIONS.key?(k)}
raise ArgumentError, "Unsupported options parsing StringIO: #{opts.keys}" unless opts.empty?
StringIO.new(str, mode)
end
else
def create_stringio(str, mode, opts)
StringIO.new(str, mode, **opts)
end
end
end
# :call-seq:
# CSV.new(string)
# CSV.new(io)
# CSV.new(string, **options)
# CSV.new(io, **options)
#
# Returns the new \CSV object created using +string+ or +io+
# and the specified +options+.
#
# - Argument +string+ should be a \String object;
# it will be put into a new StringIO object positioned at the beginning.
# :include: ../doc/csv/arguments/io.rdoc
# - Argument +options+: See:
# * {Options for Parsing}[#class-CSV-label-Options+for+Parsing]
# * {Options for Generating}[#class-CSV-label-Options+for+Generating]
# For performance reasons, the options cannot be overridden
# in a \CSV object, so those specified here will endure.
#
# In addition to the \CSV instance methods, several \IO methods are delegated.
# See {Delegated Methods}[#class-CSV-label-Delegated+Methods].
#
# ---
#
# Create a \CSV object from a \String object:
# csv = CSV.new('foo,0')
# csv # => #
#
# Create a \CSV object from a \File object:
# File.write('t.csv', 'foo,0')
# csv = CSV.new(File.open('t.csv'))
# csv # => #
#
# ---
#
# Raises an exception if the argument is +nil+:
# # Raises ArgumentError (Cannot parse nil as CSV):
# CSV.new(nil)
#
def initialize(data,
col_sep: ",",
row_sep: :auto,
quote_char: '"',
field_size_limit: nil,
max_field_size: nil,
converters: nil,
unconverted_fields: nil,
headers: false,
return_headers: false,
write_headers: nil,
header_converters: nil,
skip_blanks: false,
force_quotes: false,
skip_lines: nil,
liberal_parsing: false,
internal_encoding: nil,
external_encoding: nil,
encoding: nil,
nil_value: nil,
empty_value: "",
strip: false,
quote_empty: true,
write_converters: nil,
write_nil_value: nil,
write_empty_value: "")
raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
if data.is_a?(String)
if encoding
if encoding.is_a?(String)
data_external_encoding, data_internal_encoding = encoding.split(":", 2)
if data_internal_encoding
data = data.encode(data_internal_encoding, data_external_encoding)
else
data = data.dup.force_encoding(data_external_encoding)
end
else
data = data.dup.force_encoding(encoding)
end
end
@io = StringIO.new(data)
else
@io = data
end
@encoding = determine_encoding(encoding, internal_encoding)
@base_fields_converter_options = {
nil_value: nil_value,
empty_value: empty_value,
}
@write_fields_converter_options = {
nil_value: write_nil_value,
empty_value: write_empty_value,
}
@initial_converters = converters
@initial_header_converters = header_converters
@initial_write_converters = write_converters
if max_field_size.nil? and field_size_limit
max_field_size = field_size_limit - 1
end
@parser_options = {
column_separator: col_sep,
row_separator: row_sep,
quote_character: quote_char,
max_field_size: max_field_size,
unconverted_fields: unconverted_fields,
headers: headers,
return_headers: return_headers,
skip_blanks: skip_blanks,
skip_lines: skip_lines,
liberal_parsing: liberal_parsing,
encoding: @encoding,
nil_value: nil_value,
empty_value: empty_value,
strip: strip,
}
@parser = nil
@parser_enumerator = nil
@eof_error = nil
@writer_options = {
encoding: @encoding,
force_encoding: (not encoding.nil?),
force_quotes: force_quotes,
headers: headers,
write_headers: write_headers,
column_separator: col_sep,
row_separator: row_sep,
quote_character: quote_char,
quote_empty: quote_empty,
}
@writer = nil
writer if @writer_options[:write_headers]
end
class TSV < CSV
def initialize(data, **options)
super(data, **({col_sep: "\t"}.merge(options)))
end
end
# :call-seq:
# csv.col_sep -> string
#
# Returns the encoded column separator; used for parsing and writing;
# see {Option +col_sep+}[#class-CSV-label-Option+col_sep]:
# CSV.new('').col_sep # => ","
def col_sep
parser.column_separator
end
# :call-seq:
# csv.row_sep -> string
#
# Returns the encoded row separator; used for parsing and writing;
# see {Option +row_sep+}[#class-CSV-label-Option+row_sep]:
# CSV.new('').row_sep # => "\n"
def row_sep
parser.row_separator
end
# :call-seq:
# csv.quote_char -> character
#
# Returns the encoded quote character; used for parsing and writing;
# see {Option +quote_char+}[#class-CSV-label-Option+quote_char]:
# CSV.new('').quote_char # => "\""
def quote_char
parser.quote_character
end
# :call-seq:
# csv.field_size_limit -> integer or nil
#
# Returns the limit for field size; used for parsing;
# see {Option +field_size_limit+}[#class-CSV-label-Option+field_size_limit]:
# CSV.new('').field_size_limit # => nil
#
# Deprecated since 3.2.3. Use +max_field_size+ instead.
def field_size_limit
parser.field_size_limit
end
# :call-seq:
# csv.max_field_size -> integer or nil
#
# Returns the limit for field size; used for parsing;
# see {Option +max_field_size+}[#class-CSV-label-Option+max_field_size]:
# CSV.new('').max_field_size # => nil
#
# Since 3.2.3.
def max_field_size
parser.max_field_size
end
# :call-seq:
# csv.skip_lines -> regexp or nil
#
# Returns the \Regexp used to identify comment lines; used for parsing;
# see {Option +skip_lines+}[#class-CSV-label-Option+skip_lines]:
# CSV.new('').skip_lines # => nil
def skip_lines
parser.skip_lines
end
# :call-seq:
# csv.converters -> array
#
# Returns an \Array containing field converters;
# see {Field Converters}[#class-CSV-label-Field+Converters]:
# csv = CSV.new('')
# csv.converters # => []
# csv.convert(:integer)
# csv.converters # => [:integer]
# csv.convert(proc {|x| x.to_s })
# csv.converters
#
# Notes that you need to call
# +Ractor.make_shareable(CSV::Converters)+ on the main Ractor to use
# this method.
def converters
parser_fields_converter.map do |converter|
name = Converters.rassoc(converter)
name ? name.first : converter
end
end
# :call-seq:
# csv.unconverted_fields? -> object
#
# Returns the value that determines whether unconverted fields are to be
# available; used for parsing;
# see {Option +unconverted_fields+}[#class-CSV-label-Option+unconverted_fields]:
# CSV.new('').unconverted_fields? # => nil
def unconverted_fields?
parser.unconverted_fields?
end
# :call-seq:
# csv.headers -> object
#
# Returns the value that determines whether headers are used; used for parsing;
# see {Option +headers+}[#class-CSV-label-Option+headers]:
# CSV.new('').headers # => nil
def headers
if @writer
@writer.headers
else
parsed_headers = parser.headers
return parsed_headers if parsed_headers
raw_headers = @parser_options[:headers]
raw_headers = nil if raw_headers == false
raw_headers
end
end
# :call-seq:
# csv.return_headers? -> true or false
#
# Returns the value that determines whether headers are to be returned; used for parsing;
# see {Option +return_headers+}[#class-CSV-label-Option+return_headers]:
# CSV.new('').return_headers? # => false
def return_headers?
parser.return_headers?
end
# :call-seq:
# csv.write_headers? -> true or false
#
# Returns the value that determines whether headers are to be written; used for generating;
# see {Option +write_headers+}[#class-CSV-label-Option+write_headers]:
# CSV.new('').write_headers? # => nil
def write_headers?
@writer_options[:write_headers]
end
# :call-seq:
# csv.header_converters -> array
#
# Returns an \Array containing header converters; used for parsing;
# see {Header Converters}[#class-CSV-label-Header+Converters]:
# CSV.new('').header_converters # => []
#
# Notes that you need to call
# +Ractor.make_shareable(CSV::HeaderConverters)+ on the main Ractor
# to use this method.
def header_converters
header_fields_converter.map do |converter|
name = HeaderConverters.rassoc(converter)
name ? name.first : converter
end
end
# :call-seq:
# csv.skip_blanks? -> true or false
#
# Returns the value that determines whether blank lines are to be ignored; used for parsing;
# see {Option +skip_blanks+}[#class-CSV-label-Option+skip_blanks]:
# CSV.new('').skip_blanks? # => false
def skip_blanks?
parser.skip_blanks?
end
# :call-seq:
# csv.force_quotes? -> true or false
#
# Returns the value that determines whether all output fields are to be quoted;
# used for generating;
# see {Option +force_quotes+}[#class-CSV-label-Option+force_quotes]:
# CSV.new('').force_quotes? # => false
def force_quotes?
@writer_options[:force_quotes]
end
# :call-seq:
# csv.liberal_parsing? -> true or false
#
# Returns the value that determines whether illegal input is to be handled; used for parsing;
# see {Option +liberal_parsing+}[#class-CSV-label-Option+liberal_parsing]:
# CSV.new('').liberal_parsing? # => false
def liberal_parsing?
parser.liberal_parsing?
end
# :call-seq:
# csv.encoding -> encoding
#
# Returns the encoding used for parsing and generating;
# see {Character Encodings (M17n or Multilingualization)}[#class-CSV-label-Character+Encodings+-28M17n+or+Multilingualization-29]:
# CSV.new('').encoding # => #
attr_reader :encoding
# :call-seq:
# csv.line_no -> integer
#
# Returns the count of the rows parsed or generated.
#
# Parsing:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# CSV.open(path) do |csv|
# csv.each do |row|
# p [csv.lineno, row]
# end
# end
# Output:
# [1, ["foo", "0"]]
# [2, ["bar", "1"]]
# [3, ["baz", "2"]]
#
# Generating:
# CSV.generate do |csv|
# p csv.lineno; csv << ['foo', 0]
# p csv.lineno; csv << ['bar', 1]
# p csv.lineno; csv << ['baz', 2]
# end
# Output:
# 0
# 1
# 2
def lineno
if @writer
@writer.lineno
else
parser.lineno
end
end
# :call-seq:
# csv.line -> array
#
# Returns the line most recently read:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# CSV.open(path) do |csv|
# csv.each do |row|
# p [csv.lineno, csv.line]
# end
# end
# Output:
# [1, "foo,0\n"]
# [2, "bar,1\n"]
# [3, "baz,2\n"]
def line
parser.line
end
### IO and StringIO Delegation ###
extend Forwardable
def_delegators :@io, :binmode, :close, :close_read, :close_write,
:closed?, :external_encoding, :fcntl,
:fileno, :flush, :fsync, :internal_encoding,
:isatty, :pid, :pos, :pos=, :reopen,
:seek, :string, :sync, :sync=, :tell,
:truncate, :tty?
def binmode?
if @io.respond_to?(:binmode?)
@io.binmode?
else
false
end
end
def flock(*args)
raise NotImplementedError unless @io.respond_to?(:flock)
@io.flock(*args)
end
def ioctl(*args)
raise NotImplementedError unless @io.respond_to?(:ioctl)
@io.ioctl(*args)
end
def path
@io.path if @io.respond_to?(:path)
end
def stat(*args)
raise NotImplementedError unless @io.respond_to?(:stat)
@io.stat(*args)
end
def to_i
raise NotImplementedError unless @io.respond_to?(:to_i)
@io.to_i
end
def to_io
@io.respond_to?(:to_io) ? @io.to_io : @io
end
def eof?
return false if @eof_error
begin
parser_enumerator.peek
false
rescue MalformedCSVError => error
@eof_error = error
false
rescue StopIteration
true
end
end
alias_method :eof, :eof?
# Rewinds the underlying IO object and resets CSV's lineno() counter.
def rewind
@parser = nil
@parser_enumerator = nil
@eof_error = nil
@writer.rewind if @writer
@io.rewind
end
### End Delegation ###
# :call-seq:
# csv << row -> self
#
# Appends a row to +self+.
#
# - Argument +row+ must be an \Array object or a CSV::Row object.
# - The output stream must be open for writing.
#
# ---
#
# Append Arrays:
# CSV.generate do |csv|
# csv << ['foo', 0]
# csv << ['bar', 1]
# csv << ['baz', 2]
# end # => "foo,0\nbar,1\nbaz,2\n"
#
# Append CSV::Rows:
# headers = []
# CSV.generate do |csv|
# csv << CSV::Row.new(headers, ['foo', 0])
# csv << CSV::Row.new(headers, ['bar', 1])
# csv << CSV::Row.new(headers, ['baz', 2])
# end # => "foo,0\nbar,1\nbaz,2\n"
#
# Headers in CSV::Row objects are not appended:
# headers = ['Name', 'Count']
# CSV.generate do |csv|
# csv << CSV::Row.new(headers, ['foo', 0])
# csv << CSV::Row.new(headers, ['bar', 1])
# csv << CSV::Row.new(headers, ['baz', 2])
# end # => "foo,0\nbar,1\nbaz,2\n"
#
# ---
#
# Raises an exception if +row+ is not an \Array or \CSV::Row:
# CSV.generate do |csv|
# # Raises NoMethodError (undefined method `collect' for :foo:Symbol)
# csv << :foo
# end
#
# Raises an exception if the output stream is not opened for writing:
# path = 't.csv'
# File.write(path, '')
# File.open(path) do |file|
# CSV.open(file) do |csv|
# # Raises IOError (not opened for writing)
# csv << ['foo', 0]
# end
# end
def <<(row)
writer << row
self
end
alias_method :add_row, :<<
alias_method :puts, :<<
# :call-seq:
# convert(converter_name) -> array_of_procs
# convert {|field, field_info| ... } -> array_of_procs
#
# - With no block, installs a field converter (a \Proc).
# - With a block, defines and installs a custom field converter.
# - Returns the \Array of installed field converters.
#
# - Argument +converter_name+, if given, should be the name
# of an existing field converter.
#
# See {Field Converters}[#class-CSV-label-Field+Converters].
# ---
#
# With no block, installs a field converter:
# csv = CSV.new('')
# csv.convert(:integer)
# csv.convert(:float)
# csv.convert(:date)
# csv.converters # => [:integer, :float, :date]
#
# ---
#
# The block, if given, is called for each field:
# - Argument +field+ is the field value.
# - Argument +field_info+ is a CSV::FieldInfo object
# containing details about the field.
#
# The examples here assume the prior execution of:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# Example giving a block:
# csv = CSV.open(path)
# csv.convert {|field, field_info| p [field, field_info]; field.upcase }
# csv.read # => [["FOO", "0"], ["BAR", "1"], ["BAZ", "2"]]
#
# Output:
# ["foo", #]
# ["0", #]
# ["bar", #]
# ["1", #]
# ["baz", #]
# ["2", #]
#
# The block need not return a \String object:
# csv = CSV.open(path)
# csv.convert {|field, field_info| field.to_sym }
# csv.read # => [[:foo, :"0"], [:bar, :"1"], [:baz, :"2"]]
#
# If +converter_name+ is given, the block is not called:
# csv = CSV.open(path)
# csv.convert(:integer) {|field, field_info| fail 'Cannot happen' }
# csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
#
# ---
#
# Raises a parse-time exception if +converter_name+ is not the name of a built-in
# field converter:
# csv = CSV.open(path)
# csv.convert(:nosuch) => [nil]
# # Raises NoMethodError (undefined method `arity' for nil:NilClass)
# csv.read
def convert(name = nil, &converter)
parser_fields_converter.add_converter(name, &converter)
end
# :call-seq:
# header_convert(converter_name) -> array_of_procs
# header_convert {|header, field_info| ... } -> array_of_procs
#
# - With no block, installs a header converter (a \Proc).
# - With a block, defines and installs a custom header converter.
# - Returns the \Array of installed header converters.
#
# - Argument +converter_name+, if given, should be the name
# of an existing header converter.
#
# See {Header Converters}[#class-CSV-label-Header+Converters].
# ---
#
# With no block, installs a header converter:
# csv = CSV.new('')
# csv.header_convert(:symbol)
# csv.header_convert(:downcase)
# csv.header_converters # => [:symbol, :downcase]
#
# ---
#
# The block, if given, is called for each header:
# - Argument +header+ is the header value.
# - Argument +field_info+ is a CSV::FieldInfo object
# containing details about the header.
#
# The examples here assume the prior execution of:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# Example giving a block:
# csv = CSV.open(path, headers: true)
# csv.header_convert {|header, field_info| p [header, field_info]; header.upcase }
# table = csv.read
# table # => #
# table.headers # => ["NAME", "VALUE"]
#
# Output:
# ["Name", #]
# ["Value", #]
# The block need not return a \String object:
# csv = CSV.open(path, headers: true)
# csv.header_convert {|header, field_info| header.to_sym }
# table = csv.read
# table.headers # => [:Name, :Value]
#
# If +converter_name+ is given, the block is not called:
# csv = CSV.open(path, headers: true)
# csv.header_convert(:downcase) {|header, field_info| fail 'Cannot happen' }
# table = csv.read
# table.headers # => ["name", "value"]
# ---
#
# Raises a parse-time exception if +converter_name+ is not the name of a built-in
# field converter:
# csv = CSV.open(path, headers: true)
# csv.header_convert(:nosuch)
# # Raises NoMethodError (undefined method `arity' for nil:NilClass)
# csv.read
def header_convert(name = nil, &converter)
header_fields_converter.add_converter(name, &converter)
end
include Enumerable
# :call-seq:
# csv.each -> enumerator
# csv.each {|row| ...}
#
# Calls the block with each successive row.
# The data source must be opened for reading.
#
# Without headers:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.each do |row|
# p row
# end
# Output:
# ["foo", "0"]
# ["bar", "1"]
# ["baz", "2"]
#
# With headers:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string, headers: true)
# csv.each do |row|
# p row
# end
# Output:
#
#
#
#
# ---
#
# Raises an exception if the source is not opened for reading:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.close
# # Raises IOError (not opened for reading)
# csv.each do |row|
# p row
# end
def each(&block)
return to_enum(__method__) unless block_given?
begin
while true
yield(parser_enumerator.next)
end
rescue StopIteration
end
end
# :call-seq:
# csv.read -> array or csv_table
#
# Forms the remaining rows from +self+ into:
# - A CSV::Table object, if headers are in use.
# - An \Array of Arrays, otherwise.
#
# The data source must be opened for reading.
#
# Without headers:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# csv = CSV.open(path)
# csv.read # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# With headers:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
# csv = CSV.open(path, headers: true)
# csv.read # => #
#
# ---
#
# Raises an exception if the source is not opened for reading:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.close
# # Raises IOError (not opened for reading)
# csv.read
def read
rows = to_a
if parser.use_headers?
Table.new(rows, headers: parser.headers)
else
rows
end
end
alias_method :readlines, :read
# :call-seq:
# csv.header_row? -> true or false
#
# Returns +true+ if the next row to be read is a header row\;
# +false+ otherwise.
#
# Without headers:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.header_row? # => false
#
# With headers:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string, headers: true)
# csv.header_row? # => true
# csv.shift # => #
# csv.header_row? # => false
#
# ---
#
# Raises an exception if the source is not opened for reading:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.close
# # Raises IOError (not opened for reading)
# csv.header_row?
def header_row?
parser.header_row?
end
# :call-seq:
# csv.shift -> array, csv_row, or nil
#
# Returns the next row of data as:
# - An \Array if no headers are used.
# - A CSV::Row object if headers are used.
#
# The data source must be opened for reading.
#
# Without headers:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.shift # => ["foo", "0"]
# csv.shift # => ["bar", "1"]
# csv.shift # => ["baz", "2"]
# csv.shift # => nil
#
# With headers:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string, headers: true)
# csv.shift # => #
# csv.shift # => #
# csv.shift # => #
# csv.shift # => nil
#
# ---
#
# Raises an exception if the source is not opened for reading:
# string = "foo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string)
# csv.close
# # Raises IOError (not opened for reading)
# csv.shift
def shift
if @eof_error
eof_error, @eof_error = @eof_error, nil
raise eof_error
end
begin
parser_enumerator.next
rescue StopIteration
nil
end
end
alias_method :gets, :shift
alias_method :readline, :shift
# :call-seq:
# csv.inspect -> string
#
# Returns a \String showing certain properties of +self+:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# csv = CSV.new(string, headers: true)
# s = csv.inspect
# s # => "#"
def inspect
str = ["#<", self.class.to_s, " io_type:"]
# show type of wrapped IO
if @io == $stdout then str << "$stdout"
elsif @io == $stdin then str << "$stdin"
elsif @io == $stderr then str << "$stderr"
else str << @io.class.to_s
end
# show IO.path(), if available
if @io.respond_to?(:path) and (p = @io.path)
str << " io_path:" << p.inspect
end
# show encoding
str << " encoding:" << @encoding.name
# show other attributes
["lineno", "col_sep", "row_sep", "quote_char"].each do |attr_name|
if a = __send__(attr_name)
str << " " << attr_name << ":" << a.inspect
end
end
["skip_blanks", "liberal_parsing"].each do |attr_name|
if a = __send__("#{attr_name}?")
str << " " << attr_name << ":" << a.inspect
end
end
_headers = headers
str << " headers:" << _headers.inspect if _headers
str << ">"
begin
str.join('')
rescue # any encoding error
str.map do |s|
e = Encoding::Converter.asciicompat_encoding(s.encoding)
e ? s.encode(e) : s.force_encoding("ASCII-8BIT")
end.join('')
end
end
private
def determine_encoding(encoding, internal_encoding)
# honor the IO encoding if we can, otherwise default to ASCII-8BIT
io_encoding = raw_encoding
return io_encoding if io_encoding
return Encoding.find(internal_encoding) if internal_encoding
if encoding
encoding, = encoding.split(":", 2) if encoding.is_a?(String)
return Encoding.find(encoding)
end
Encoding.default_internal || Encoding.default_external
end
def normalize_converters(converters)
converters ||= []
unless converters.is_a?(Array)
converters = [converters]
end
converters.collect do |converter|
case converter
when Proc # custom code block
[nil, converter]
else # by name
[converter, nil]
end
end
end
#
# Processes +fields+ with @converters, or @header_converters
# if +headers+ is passed as +true+, returning the converted field set. Any
# converter that changes the field into something other than a String halts
# the pipeline of conversion for that field. This is primarily an efficiency
# shortcut.
#
def convert_fields(fields, headers = false)
if headers
header_fields_converter.convert(fields, nil, 0)
else
parser_fields_converter.convert(fields, @headers, lineno)
end
end
#
# Returns the encoding of the internal IO object.
#
def raw_encoding
if @io.respond_to? :internal_encoding
@io.internal_encoding || @io.external_encoding
elsif @io.respond_to? :encoding
@io.encoding
else
nil
end
end
def parser_fields_converter
@parser_fields_converter ||= build_parser_fields_converter
end
def build_parser_fields_converter
specific_options = {
builtin_converters_name: :Converters,
}
options = @base_fields_converter_options.merge(specific_options)
build_fields_converter(@initial_converters, options)
end
def header_fields_converter
@header_fields_converter ||= build_header_fields_converter
end
def build_header_fields_converter
specific_options = {
builtin_converters_name: :HeaderConverters,
accept_nil: true,
}
options = @base_fields_converter_options.merge(specific_options)
build_fields_converter(@initial_header_converters, options)
end
def writer_fields_converter
@writer_fields_converter ||= build_writer_fields_converter
end
def build_writer_fields_converter
build_fields_converter(@initial_write_converters,
@write_fields_converter_options)
end
def build_fields_converter(initial_converters, options)
fields_converter = FieldsConverter.new(options)
normalize_converters(initial_converters).each do |name, converter|
fields_converter.add_converter(name, &converter)
end
fields_converter
end
def parser
@parser ||= Parser.new(@io, parser_options)
end
def parser_options
@parser_options.merge(header_fields_converter: header_fields_converter,
fields_converter: parser_fields_converter)
end
def parser_enumerator
@parser_enumerator ||= parser.parse
end
def writer
@writer ||= Writer.new(@io, writer_options)
end
def writer_options
@writer_options.merge(header_fields_converter: header_fields_converter,
fields_converter: writer_fields_converter)
end
end
# Passes +args+ to CSV::instance.
#
# CSV("CSV,data").read
# #=> [["CSV", "data"]]
#
# If a block is given, the instance is passed the block and the return value
# becomes the return value of the block.
#
# CSV("CSV,data") { |c|
# c.read.any? { |a| a.include?("data") }
# } #=> true
#
# CSV("CSV,data") { |c|
# c.read.any? { |a| a.include?("zombies") }
# } #=> false
#
# CSV options may also be given.
#
# io = StringIO.new
# CSV(io, col_sep: ";") { |csv| csv << ["a", "b", "c"] }
#
# This API is not Ractor-safe.
#
def CSV(*args, **options, &block)
CSV.instance(*args, **options, &block)
end
require_relative "csv/version"
require_relative "csv/core_ext/array"
require_relative "csv/core_ext/string"
csv-3.3.4/lib/csv/ 0000755 0000041 0000041 00000000000 15000146530 013677 5 ustar www-data www-data csv-3.3.4/lib/csv/row.rb 0000644 0000041 0000041 00000060277 15000146530 015047 0 ustar www-data www-data # frozen_string_literal: true
require "forwardable"
class CSV
# = \CSV::Row
# A \CSV::Row instance represents a \CSV table row.
# (see {class CSV}[../CSV.html]).
#
# The instance may have:
# - Fields: each is an object, not necessarily a \String.
# - Headers: each serves a key, and also need not be a \String.
#
# === Instance Methods
#
# \CSV::Row has three groups of instance methods:
# - Its own internally defined instance methods.
# - Methods included by module Enumerable.
# - Methods delegated to class Array.:
# * Array#empty?
# * Array#length
# * Array#size
#
# == Creating a \CSV::Row Instance
#
# Commonly, a new \CSV::Row instance is created by parsing \CSV source
# that has headers:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.each {|row| p row }
# Output:
# #
# #
# #
#
# You can also create a row directly. See ::new.
#
# == Headers
#
# Like a \CSV::Table, a \CSV::Row has headers.
#
# A \CSV::Row that was created by parsing \CSV source
# inherits its headers from the table:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table.first
# row.headers # => ["Name", "Value"]
#
# You can also create a new row with headers;
# like the keys in a \Hash, the headers need not be Strings:
# row = CSV::Row.new([:name, :value], ['foo', 0])
# row.headers # => [:name, :value]
#
# The new row retains its headers even if added to a table
# that has headers:
# table << row # => #
# row.headers # => [:name, :value]
# row[:name] # => "foo"
# row['Name'] # => nil
#
#
#
# == Accessing Fields
#
# You may access a field in a \CSV::Row with either its \Integer index
# (\Array-style) or its header (\Hash-style).
#
# Fetch a field using method #[]:
# row = CSV::Row.new(['Name', 'Value'], ['foo', 0])
# row[1] # => 0
# row['Value'] # => 0
#
# Set a field using method #[]=:
# row = CSV::Row.new(['Name', 'Value'], ['foo', 0])
# row # => #
# row[0] = 'bar'
# row['Value'] = 1
# row # => #
#
class Row
# :call-seq:
# CSV::Row.new(headers, fields, header_row = false) -> csv_row
#
# Returns the new \CSV::Row instance constructed from
# arguments +headers+ and +fields+; both should be Arrays;
# note that the fields need not be Strings:
# row = CSV::Row.new(['Name', 'Value'], ['foo', 0])
# row # => #
#
# If the \Array lengths are different, the shorter is +nil+-filled:
# row = CSV::Row.new(['Name', 'Value', 'Date', 'Size'], ['foo', 0])
# row # => #
#
# Each \CSV::Row object is either a field row or a header row;
# by default, a new row is a field row; for the row created above:
# row.field_row? # => true
# row.header_row? # => false
#
# If the optional argument +header_row+ is given as +true+,
# the created row is a header row:
# row = CSV::Row.new(['Name', 'Value'], ['foo', 0], header_row = true)
# row # => #
# row.field_row? # => false
# row.header_row? # => true
def initialize(headers, fields, header_row = false)
@header_row = header_row
headers.each { |h| h.freeze if h.is_a? String }
# handle extra headers or fields
@row = if headers.size >= fields.size
headers.zip(fields)
else
fields.zip(headers).each(&:reverse!)
end
end
# Internal data format used to compare equality.
attr_reader :row
protected :row
### Array Delegation ###
extend Forwardable
def_delegators :@row, :empty?, :length, :size
# :call-seq:
# row.initialize_copy(other_row) -> self
#
# Calls superclass method.
def initialize_copy(other)
super_return_value = super
@row = @row.collect(&:dup)
super_return_value
end
# :call-seq:
# row.header_row? -> true or false
#
# Returns +true+ if this is a header row, +false+ otherwise.
def header_row?
@header_row
end
# :call-seq:
# row.field_row? -> true or false
#
# Returns +true+ if this is a field row, +false+ otherwise.
def field_row?
not header_row?
end
# :call-seq:
# row.headers -> array_of_headers
#
# Returns the headers for this row:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table.first
# row.headers # => ["Name", "Value"]
def headers
@row.map(&:first)
end
# :call-seq:
# field(index) -> value
# field(header) -> value
# field(header, offset) -> value
#
# Returns the field value for the given +index+ or +header+.
#
# ---
#
# Fetch field value by \Integer index:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.field(0) # => "foo"
# row.field(1) # => "bar"
#
# Counts backward from the last column if +index+ is negative:
# row.field(-1) # => "0"
# row.field(-2) # => "foo"
#
# Returns +nil+ if +index+ is out of range:
# row.field(2) # => nil
# row.field(-3) # => nil
#
# ---
#
# Fetch field value by header (first found):
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.field('Name') # => "Foo"
#
# Fetch field value by header, ignoring +offset+ leading fields:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.field('Name', 2) # => "Baz"
#
# Returns +nil+ if the header does not exist.
def field(header_or_index, minimum_index = 0)
# locate the pair
finder = (header_or_index.is_a?(Integer) || header_or_index.is_a?(Range)) ? :[] : :assoc
pair = @row[minimum_index..-1].public_send(finder, header_or_index)
# return the field if we have a pair
if pair.nil?
nil
else
header_or_index.is_a?(Range) ? pair.map(&:last) : pair.last
end
end
alias_method :[], :field
#
# :call-seq:
# fetch(header) -> value
# fetch(header, default) -> value
# fetch(header) {|row| ... } -> value
#
# Returns the field value as specified by +header+.
#
# ---
#
# With the single argument +header+, returns the field value
# for that header (first found):
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.fetch('Name') # => "Foo"
#
# Raises exception +KeyError+ if the header does not exist.
#
# ---
#
# With arguments +header+ and +default+ given,
# returns the field value for the header (first found)
# if the header exists, otherwise returns +default+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.fetch('Name', '') # => "Foo"
# row.fetch(:nosuch, '') # => ""
#
# ---
#
# With argument +header+ and a block given,
# returns the field value for the header (first found)
# if the header exists; otherwise calls the block
# and returns its return value:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.fetch('Name') {|header| fail 'Cannot happen' } # => "Foo"
# row.fetch(:nosuch) {|header| "Header '#{header} not found'" } # => "Header 'nosuch not found'"
def fetch(header, *varargs)
raise ArgumentError, "Too many arguments" if varargs.length > 1
pair = @row.assoc(header)
if pair
pair.last
else
if block_given?
yield header
elsif varargs.empty?
raise KeyError, "key not found: #{header}"
else
varargs.first
end
end
end
# :call-seq:
# row.has_key?(header) -> true or false
#
# Returns +true+ if there is a field with the given +header+,
# +false+ otherwise.
def has_key?(header)
!!@row.assoc(header)
end
alias_method :include?, :has_key?
alias_method :key?, :has_key?
alias_method :member?, :has_key?
alias_method :header?, :has_key?
#
# :call-seq:
# row[index] = value -> value
# row[header, offset] = value -> value
# row[header] = value -> value
#
# Assigns the field value for the given +index+ or +header+;
# returns +value+.
#
# ---
#
# Assign field value by \Integer index:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row[0] = 'Bat'
# row[1] = 3
# row # => #
#
# Counts backward from the last column if +index+ is negative:
# row[-1] = 4
# row[-2] = 'Bam'
# row # => #
#
# Extends the row with nil:nil if positive +index+ is not in the row:
# row[4] = 5
# row # => #
#
# Raises IndexError if negative +index+ is too small (too far from zero).
#
# ---
#
# Assign field value by header (first found):
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row['Name'] = 'Bat'
# row # => #
#
# Assign field value by header, ignoring +offset+ leading fields:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row['Name', 2] = 4
# row # => #
#
# Append new field by (new) header:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row['New'] = 6
# row# => #
def []=(*args)
value = args.pop
if args.first.is_a? Integer
if @row[args.first].nil? # extending past the end with index
@row[args.first] = [nil, value]
@row.map! { |pair| pair.nil? ? [nil, nil] : pair }
else # normal index assignment
@row[args.first][1] = value
end
else
index = index(*args)
if index.nil? # appending a field
self << [args.first, value]
else # normal header assignment
@row[index][1] = value
end
end
end
#
# :call-seq:
# row << [header, value] -> self
# row << hash -> self
# row << value -> self
#
# Adds a field to +self+; returns +self+:
#
# If the argument is a 2-element \Array [header, value],
# a field is added with the given +header+ and +value+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row << ['NAME', 'Bat']
# row # => #
#
# If the argument is a \Hash, each key-value pair is added
# as a field with header +key+ and value +value+.
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row << {NAME: 'Bat', name: 'Bam'}
# row # => #
#
# Otherwise, the given +value+ is added as a field with no header.
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row << 'Bag'
# row # => #
def <<(arg)
if arg.is_a?(Array) and arg.size == 2 # appending a header and name
@row << arg
elsif arg.is_a?(Hash) # append header and name pairs
arg.each { |pair| @row << pair }
else # append field value
@row << [nil, arg]
end
self # for chaining
end
# :call-seq:
# row.push(*values) -> self
#
# Appends each of the given +values+ to +self+ as a field; returns +self+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.push('Bat', 'Bam')
# row # => #
def push(*args)
args.each { |arg| self << arg }
self # for chaining
end
#
# :call-seq:
# delete(index) -> [header, value] or nil
# delete(header) -> [header, value] or empty_array
# delete(header, offset) -> [header, value] or empty_array
#
# Removes a specified field from +self+; returns the 2-element \Array
# [header, value] if the field exists.
#
# If an \Integer argument +index+ is given,
# removes and returns the field at offset +index+,
# or returns +nil+ if the field does not exist:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.delete(1) # => ["Name", "Bar"]
# row.delete(50) # => nil
#
# Otherwise, if the single argument +header+ is given,
# removes and returns the first-found field with the given header,
# of returns a new empty \Array if the field does not exist:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.delete('Name') # => ["Name", "Foo"]
# row.delete('NAME') # => []
#
# If argument +header+ and \Integer argument +offset+ are given,
# removes and returns the first-found field with the given header
# whose +index+ is at least as large as +offset+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.delete('Name', 1) # => ["Name", "Bar"]
# row.delete('NAME', 1) # => []
def delete(header_or_index, minimum_index = 0)
if header_or_index.is_a? Integer # by index
@row.delete_at(header_or_index)
elsif i = index(header_or_index, minimum_index) # by header
@row.delete_at(i)
else
[ ]
end
end
# :call-seq:
# row.delete_if {|header, value| ... } -> self
#
# Removes fields from +self+ as selected by the block; returns +self+.
#
# Removes each field for which the block returns a truthy value:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.delete_if {|header, value| value.start_with?('B') } # => true
# row # => #
# row.delete_if {|header, value| header.start_with?('B') } # => false
#
# If no block is given, returns a new Enumerator:
# row.delete_if # => #:delete_if>
def delete_if(&block)
return enum_for(__method__) { size } unless block_given?
@row.delete_if(&block)
self # for chaining
end
# :call-seq:
# self.fields(*specifiers) -> array_of_fields
#
# Returns field values per the given +specifiers+, which may be any mixture of:
# - \Integer index.
# - \Range of \Integer indexes.
# - 2-element \Array containing a header and offset.
# - Header.
# - \Range of headers.
#
# For +specifier+ in one of the first four cases above,
# returns the result of self.field(specifier); see #field.
#
# Although there may be any number of +specifiers+,
# the examples here will illustrate one at a time.
#
# When the specifier is an \Integer +index+,
# returns self.field(index)L
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.fields(1) # => ["Bar"]
#
# When the specifier is a \Range of \Integers +range+,
# returns self.field(range):
# row.fields(1..2) # => ["Bar", "Baz"]
#
# When the specifier is a 2-element \Array +array+,
# returns self.field(array)L
# row.fields('Name', 1) # => ["Foo", "Bar"]
#
# When the specifier is a header +header+,
# returns self.field(header)L
# row.fields('Name') # => ["Foo"]
#
# When the specifier is a \Range of headers +range+,
# forms a new \Range +new_range+ from the indexes of
# range.start and range.end,
# and returns self.field(new_range):
# source = "Name,NAME,name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.fields('Name'..'NAME') # => ["Foo", "Bar"]
#
# Returns all fields if no argument given:
# row.fields # => ["Foo", "Bar", "Baz"]
def fields(*headers_and_or_indices)
if headers_and_or_indices.empty? # return all fields--no arguments
@row.map(&:last)
else # or work like values_at()
all = []
headers_and_or_indices.each do |h_or_i|
if h_or_i.is_a? Range
index_begin = h_or_i.begin.is_a?(Integer) ? h_or_i.begin :
index(h_or_i.begin)
index_end = h_or_i.end.is_a?(Integer) ? h_or_i.end :
index(h_or_i.end)
new_range = h_or_i.exclude_end? ? (index_begin...index_end) :
(index_begin..index_end)
all.concat(fields.values_at(new_range))
else
all << field(*Array(h_or_i))
end
end
return all
end
end
alias_method :values_at, :fields
# :call-seq:
# index(header) -> index
# index(header, offset) -> index
#
# Returns the index for the given header, if it exists;
# otherwise returns +nil+.
#
# With the single argument +header+, returns the index
# of the first-found field with the given +header+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.index('Name') # => 0
# row.index('NAME') # => nil
#
# With arguments +header+ and +offset+,
# returns the index of the first-found field with given +header+,
# but ignoring the first +offset+ fields:
# row.index('Name', 1) # => 1
# row.index('Name', 3) # => nil
def index(header, minimum_index = 0)
# find the pair
index = headers[minimum_index..-1].index(header)
# return the index at the right offset, if we found one
index.nil? ? nil : index + minimum_index
end
# :call-seq:
# row.field?(value) -> true or false
#
# Returns +true+ if +value+ is a field in this row, +false+ otherwise:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.field?('Bar') # => true
# row.field?('BAR') # => false
def field?(data)
fields.include? data
end
include Enumerable
# :call-seq:
# row.each {|header, value| ... } -> self
#
# Calls the block with each header-value pair; returns +self+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.each {|header, value| p [header, value] }
# Output:
# ["Name", "Foo"]
# ["Name", "Bar"]
# ["Name", "Baz"]
#
# If no block is given, returns a new Enumerator:
# row.each # => #:each>
def each(&block)
return enum_for(__method__) { size } unless block_given?
@row.each(&block)
self # for chaining
end
alias_method :each_pair, :each
# :call-seq:
# row == other -> true or false
#
# Returns +true+ if +other+ is a /CSV::Row that has the same
# fields (headers and values) in the same order as +self+;
# otherwise returns +false+:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# other_row = table[0]
# row == other_row # => true
# other_row = table[1]
# row == other_row # => false
def ==(other)
return @row == other.row if other.is_a? CSV::Row
@row == other
end
# :call-seq:
# row.to_h -> hash
#
# Returns the new \Hash formed by adding each header-value pair in +self+
# as a key-value pair in the \Hash.
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.to_h # => {"Name"=>"foo", "Value"=>"0"}
#
# Header order is preserved, but repeated headers are ignored:
# source = "Name,Name,Name\nFoo,Bar,Baz\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.to_h # => {"Name"=>"Foo"}
def to_h
hash = {}
each do |key, _value|
hash[key] = self[key] unless hash.key?(key)
end
hash
end
alias_method :to_hash, :to_h
# :call-seq:
# row.deconstruct_keys(keys) -> hash
#
# Returns the new \Hash suitable for pattern matching containing only the
# keys specified as an argument.
def deconstruct_keys(keys)
if keys.nil?
to_h
else
keys.to_h { |key| [key, self[key]] }
end
end
alias_method :to_ary, :to_a
# :call-seq:
# row.deconstruct -> array
#
# Returns the new \Array suitable for pattern matching containing the values
# of the row.
def deconstruct
fields
end
# :call-seq:
# row.to_csv -> csv_string
#
# Returns the row as a \CSV String. Headers are not included:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.to_csv # => "foo,0\n"
def to_csv(**options)
fields.to_csv(**options)
end
alias_method :to_s, :to_csv
# :call-seq:
# row.dig(index_or_header, *identifiers) -> object
#
# Finds and returns the object in nested object that is specified
# by +index_or_header+ and +specifiers+.
#
# The nested objects may be instances of various classes.
# See {Dig Methods}[rdoc-ref:dig_methods.rdoc].
#
# Examples:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.dig(1) # => "0"
# row.dig('Value') # => "0"
# row.dig(5) # => nil
def dig(index_or_header, *indexes)
value = field(index_or_header)
if value.nil?
nil
elsif indexes.empty?
value
else
unless value.respond_to?(:dig)
raise TypeError, "#{value.class} does not have \#dig method"
end
value.dig(*indexes)
end
end
# :call-seq:
# row.inspect -> string
#
# Returns an ASCII-compatible \String showing:
# - Class \CSV::Row.
# - Header-value pairs.
# Example:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# row = table[0]
# row.inspect # => "#"
def inspect
str = ["#<", self.class.to_s]
each do |header, field|
str << " " << (header.is_a?(Symbol) ? header.to_s : header.inspect) <<
":" << field.inspect
end
str << ">"
begin
str.join('')
rescue # any encoding error
str.map do |s|
e = Encoding::Converter.asciicompat_encoding(s.encoding)
e ? s.encode(e) : s.force_encoding("ASCII-8BIT")
end.join('')
end
end
end
end
csv-3.3.4/lib/csv/table.rb 0000644 0000041 0000041 00000112420 15000146530 015313 0 ustar www-data www-data # frozen_string_literal: true
require "forwardable"
class CSV
# = \CSV::Table
# A \CSV::Table instance represents \CSV data.
# (see {class CSV}[../CSV.html]).
#
# The instance may have:
# - Rows: each is a Table::Row object.
# - Headers: names for the columns.
#
# === Instance Methods
#
# \CSV::Table has three groups of instance methods:
# - Its own internally defined instance methods.
# - Methods included by module Enumerable.
# - Methods delegated to class Array.:
# * Array#empty?
# * Array#length
# * Array#size
#
# == Creating a \CSV::Table Instance
#
# Commonly, a new \CSV::Table instance is created by parsing \CSV source
# using headers:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.class # => CSV::Table
#
# You can also create an instance directly. See ::new.
#
# == Headers
#
# If a table has headers, the headers serve as labels for the columns of data.
# Each header serves as the label for its column.
#
# The headers for a \CSV::Table object are stored as an \Array of Strings.
#
# Commonly, headers are defined in the first row of \CSV source:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.headers # => ["Name", "Value"]
#
# If no headers are defined, the \Array is empty:
# table = CSV::Table.new([])
# table.headers # => []
#
# == Access Modes
#
# \CSV::Table provides three modes for accessing table data:
# - \Row mode.
# - Column mode.
# - Mixed mode (the default for a new table).
#
# The access mode for a\CSV::Table instance affects the behavior
# of some of its instance methods:
# - #[]
# - #[]=
# - #delete
# - #delete_if
# - #each
# - #values_at
#
# === \Row Mode
#
# Set a table to row mode with method #by_row!:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_row! # => #
#
# Specify a single row by an \Integer index:
# # Get a row.
# table[1] # => #
# # Set a row, then get it.
# table[1] = CSV::Row.new(['Name', 'Value'], ['bam', 3])
# table[1] # => #
#
# Specify a sequence of rows by a \Range:
# # Get rows.
# table[1..2] # => [#, #]
# # Set rows, then get them.
# table[1..2] = [
# CSV::Row.new(['Name', 'Value'], ['bat', 4]),
# CSV::Row.new(['Name', 'Value'], ['bad', 5]),
# ]
# table[1..2] # => [["Name", #], ["Value", #]]
#
# === Column Mode
#
# Set a table to column mode with method #by_col!:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_col! # => #
#
# Specify a column by an \Integer index:
# # Get a column.
# table[0]
# # Set a column, then get it.
# table[0] = ['FOO', 'BAR', 'BAZ']
# table[0] # => ["FOO", "BAR", "BAZ"]
#
# Specify a column by its \String header:
# # Get a column.
# table['Name'] # => ["FOO", "BAR", "BAZ"]
# # Set a column, then get it.
# table['Name'] = ['Foo', 'Bar', 'Baz']
# table['Name'] # => ["Foo", "Bar", "Baz"]
#
# === Mixed Mode
#
# In mixed mode, you can refer to either rows or columns:
# - An \Integer index refers to a row.
# - A \Range index refers to multiple rows.
# - A \String index refers to a column.
#
# Set a table to mixed mode with method #by_col_or_row!:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_col_or_row! # => #
#
# Specify a single row by an \Integer index:
# # Get a row.
# table[1] # => #
# # Set a row, then get it.
# table[1] = CSV::Row.new(['Name', 'Value'], ['bam', 3])
# table[1] # => #
#
# Specify a sequence of rows by a \Range:
# # Get rows.
# table[1..2] # => [#, #]
# # Set rows, then get them.
# table[1] = CSV::Row.new(['Name', 'Value'], ['bat', 4])
# table[2] = CSV::Row.new(['Name', 'Value'], ['bad', 5])
# table[1..2] # => [["Name", #], ["Value", #]]
#
# Specify a column by its \String header:
# # Get a column.
# table['Name'] # => ["foo", "bat", "bad"]
# # Set a column, then get it.
# table['Name'] = ['Foo', 'Bar', 'Baz']
# table['Name'] # => ["Foo", "Bar", "Baz"]
class Table
# :call-seq:
# CSV::Table.new(array_of_rows, headers = nil) -> csv_table
#
# Returns a new \CSV::Table object.
#
# - Argument +array_of_rows+ must be an \Array of CSV::Row objects.
# - Argument +headers+, if given, may be an \Array of Strings.
#
# ---
#
# Create an empty \CSV::Table object:
# table = CSV::Table.new([])
# table # => #
#
# Create a non-empty \CSV::Table object:
# rows = [
# CSV::Row.new([], []),
# CSV::Row.new([], []),
# CSV::Row.new([], []),
# ]
# table = CSV::Table.new(rows)
# table # => #
#
# ---
#
# If argument +headers+ is an \Array of Strings,
# those Strings become the table's headers:
# table = CSV::Table.new([], headers: ['Name', 'Age'])
# table.headers # => ["Name", "Age"]
#
# If argument +headers+ is not given and the table has rows,
# the headers are taken from the first row:
# rows = [
# CSV::Row.new(['Foo', 'Bar'], []),
# CSV::Row.new(['foo', 'bar'], []),
# CSV::Row.new(['FOO', 'BAR'], []),
# ]
# table = CSV::Table.new(rows)
# table.headers # => ["Foo", "Bar"]
#
# If argument +headers+ is not given and the table is empty (has no rows),
# the headers are also empty:
# table = CSV::Table.new([])
# table.headers # => []
#
# ---
#
# Raises an exception if argument +array_of_rows+ is not an \Array object:
# # Raises NoMethodError (undefined method `first' for :foo:Symbol):
# CSV::Table.new(:foo)
#
# Raises an exception if an element of +array_of_rows+ is not a \CSV::Table object:
# # Raises NoMethodError (undefined method `headers' for :foo:Symbol):
# CSV::Table.new([:foo])
def initialize(array_of_rows, headers: nil)
@table = array_of_rows
@headers = headers
unless @headers
if @table.empty?
@headers = []
else
@headers = @table.first.headers
end
end
@mode = :col_or_row
end
# The current access mode for indexing and iteration.
attr_reader :mode
# Internal data format used to compare equality.
attr_reader :table
protected :table
### Array Delegation ###
extend Forwardable
def_delegators :@table, :empty?, :length, :size
# :call-seq:
# table.by_col -> table_dup
#
# Returns a duplicate of +self+, in column mode
# (see {Column Mode}[#class-CSV::Table-label-Column+Mode]):
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.mode # => :col_or_row
# dup_table = table.by_col
# dup_table.mode # => :col
# dup_table.equal?(table) # => false # It's a dup
#
# This may be used to chain method calls without changing the mode
# (but also will affect performance and memory usage):
# dup_table.by_col['Name']
#
# Also note that changes to the duplicate table will not affect the original.
def by_col
self.class.new(@table.dup).by_col!
end
# :call-seq:
# table.by_col! -> self
#
# Sets the mode for +self+ to column mode
# (see {Column Mode}[#class-CSV::Table-label-Column+Mode]); returns +self+:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.mode # => :col_or_row
# table1 = table.by_col!
# table.mode # => :col
# table1.equal?(table) # => true # Returned self
def by_col!
@mode = :col
self
end
# :call-seq:
# table.by_col_or_row -> table_dup
#
# Returns a duplicate of +self+, in mixed mode
# (see {Mixed Mode}[#class-CSV::Table-label-Mixed+Mode]):
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true).by_col!
# table.mode # => :col
# dup_table = table.by_col_or_row
# dup_table.mode # => :col_or_row
# dup_table.equal?(table) # => false # It's a dup
#
# This may be used to chain method calls without changing the mode
# (but also will affect performance and memory usage):
# dup_table.by_col_or_row['Name']
#
# Also note that changes to the duplicate table will not affect the original.
def by_col_or_row
self.class.new(@table.dup).by_col_or_row!
end
# :call-seq:
# table.by_col_or_row! -> self
#
# Sets the mode for +self+ to mixed mode
# (see {Mixed Mode}[#class-CSV::Table-label-Mixed+Mode]); returns +self+:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true).by_col!
# table.mode # => :col
# table1 = table.by_col_or_row!
# table.mode # => :col_or_row
# table1.equal?(table) # => true # Returned self
def by_col_or_row!
@mode = :col_or_row
self
end
# :call-seq:
# table.by_row -> table_dup
#
# Returns a duplicate of +self+, in row mode
# (see {Row Mode}[#class-CSV::Table-label-Row+Mode]):
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.mode # => :col_or_row
# dup_table = table.by_row
# dup_table.mode # => :row
# dup_table.equal?(table) # => false # It's a dup
#
# This may be used to chain method calls without changing the mode
# (but also will affect performance and memory usage):
# dup_table.by_row[1]
#
# Also note that changes to the duplicate table will not affect the original.
def by_row
self.class.new(@table.dup).by_row!
end
# :call-seq:
# table.by_row! -> self
#
# Sets the mode for +self+ to row mode
# (see {Row Mode}[#class-CSV::Table-label-Row+Mode]); returns +self+:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.mode # => :col_or_row
# table1 = table.by_row!
# table.mode # => :row
# table1.equal?(table) # => true # Returned self
def by_row!
@mode = :row
self
end
# :call-seq:
# table.headers -> array_of_headers
#
# Returns a new \Array containing the \String headers for the table.
#
# If the table is not empty, returns the headers from the first row:
# rows = [
# CSV::Row.new(['Foo', 'Bar'], []),
# CSV::Row.new(['FOO', 'BAR'], []),
# CSV::Row.new(['foo', 'bar'], []),
# ]
# table = CSV::Table.new(rows)
# table.headers # => ["Foo", "Bar"]
# table.delete(0)
# table.headers # => ["FOO", "BAR"]
# table.delete(0)
# table.headers # => ["foo", "bar"]
#
# If the table is empty, returns a copy of the headers in the table itself:
# table.delete(0)
# table.headers # => ["Foo", "Bar"]
def headers
if @table.empty?
@headers.dup
else
@table.first.headers
end
end
# :call-seq:
# table[n] -> row or column_data
# table[range] -> array_of_rows or array_of_column_data
# table[header] -> array_of_column_data
#
# Returns data from the table; does not modify the table.
#
# ---
#
# Fetch a \Row by Its \Integer Index::
# - Form: table[n], +n+ an integer.
# - Access mode: :row or :col_or_row.
# - Return value: _nth_ row of the table, if that row exists;
# otherwise +nil+.
#
# Returns the _nth_ row of the table if that row exists:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_row! # => #
# table[1] # => #
# table.by_col_or_row! # => #
# table[1] # => #
#
# Counts backward from the last row if +n+ is negative:
# table[-1] # => #
#
# Returns +nil+ if +n+ is too large or too small:
# table[4] # => nil
# table[-4] # => nil
#
# Raises an exception if the access mode is :row
# and +n+ is not an \Integer:
# table.by_row! # => #
# # Raises TypeError (no implicit conversion of String into Integer):
# table['Name']
#
# ---
#
# Fetch a Column by Its \Integer Index::
# - Form: table[n], +n+ an \Integer.
# - Access mode: :col.
# - Return value: _nth_ column of the table, if that column exists;
# otherwise an \Array of +nil+ fields of length self.size.
#
# Returns the _nth_ column of the table if that column exists:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_col! # => #
# table[1] # => ["0", "1", "2"]
#
# Counts backward from the last column if +n+ is negative:
# table[-2] # => ["foo", "bar", "baz"]
#
# Returns an \Array of +nil+ fields if +n+ is too large or too small:
# table[4] # => [nil, nil, nil]
# table[-4] # => [nil, nil, nil]
#
# ---
#
# Fetch Rows by \Range::
# - Form: table[range], +range+ a \Range object.
# - Access mode: :row or :col_or_row.
# - Return value: rows from the table, beginning at row range.start,
# if those rows exists.
#
# Returns rows from the table, beginning at row range.first,
# if those rows exist:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_row! # => #
# rows = table[1..2] # => #
# rows # => [#, #]
# table.by_col_or_row! # => #
# rows = table[1..2] # => #
# rows # => [#, #]
#
# If there are too few rows, returns all from range.start to the end:
# rows = table[1..50] # => #
# rows # => [#, #]
#
# Special case: if range.start == table.size, returns an empty \Array:
# table[table.size..50] # => []
#
# If range.end is negative, calculates the ending index from the end:
# rows = table[0..-1]
# rows # => [#, #, #]
#
# If range.start is negative, calculates the starting index from the end:
# rows = table[-1..2]
# rows # => [#]
#
# If range.start is larger than table.size, returns +nil+:
# table[4..4] # => nil
#
# ---
#
# Fetch Columns by \Range::
# - Form: table[range], +range+ a \Range object.
# - Access mode: :col.
# - Return value: column data from the table, beginning at column range.start,
# if those columns exist.
#
# Returns column values from the table, if the column exists;
# the values are arranged by row:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_col!
# table[0..1] # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# Special case: if range.start == headers.size,
# returns an \Array (size: table.size) of empty \Arrays:
# table[table.headers.size..50] # => [[], [], []]
#
# If range.end is negative, calculates the ending index from the end:
# table[0..-1] # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# If range.start is negative, calculates the starting index from the end:
# table[-2..2] # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# If range.start is larger than table.size,
# returns an \Array of +nil+ values:
# table[4..4] # => [nil, nil, nil]
#
# ---
#
# Fetch a Column by Its \String Header::
# - Form: table[header], +header+ a \String header.
# - Access mode: :col or :col_or_row
# - Return value: column data from the table, if that +header+ exists.
#
# Returns column values from the table, if the column exists:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_col! # => #
# table['Name'] # => ["foo", "bar", "baz"]
# table.by_col_or_row! # => #
# col = table['Name']
# col # => ["foo", "bar", "baz"]
#
# Modifying the returned column values does not modify the table:
# col[0] = 'bat'
# col # => ["bat", "bar", "baz"]
# table['Name'] # => ["foo", "bar", "baz"]
#
# Returns an \Array of +nil+ values if there is no such column:
# table['Nosuch'] # => [nil, nil, nil]
def [](index_or_header)
if @mode == :row or # by index
(@mode == :col_or_row and (index_or_header.is_a?(Integer) or index_or_header.is_a?(Range)))
@table[index_or_header]
else # by header
@table.map { |row| row[index_or_header] }
end
end
# :call-seq:
# table[n] = row -> row
# table[n] = field_or_array_of_fields -> field_or_array_of_fields
# table[header] = field_or_array_of_fields -> field_or_array_of_fields
#
# Puts data onto the table.
#
# ---
#
# Set a \Row by Its \Integer Index::
# - Form: table[n] = row, +n+ an \Integer,
# +row+ a \CSV::Row instance or an \Array of fields.
# - Access mode: :row or :col_or_row.
# - Return value: +row+.
#
# If the row exists, it is replaced:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# new_row = CSV::Row.new(['Name', 'Value'], ['bat', 3])
# table.by_row! # => #
# return_value = table[0] = new_row
# return_value.equal?(new_row) # => true # Returned the row
# table[0].to_h # => {"Name"=>"bat", "Value"=>3}
#
# With access mode :col_or_row:
# table.by_col_or_row! # => #
# table[0] = CSV::Row.new(['Name', 'Value'], ['bam', 4])
# table[0].to_h # => {"Name"=>"bam", "Value"=>4}
#
# With an \Array instead of a \CSV::Row, inherits headers from the table:
# array = ['bad', 5]
# return_value = table[0] = array
# return_value.equal?(array) # => true # Returned the array
# table[0].to_h # => {"Name"=>"bad", "Value"=>5}
#
# If the row does not exist, extends the table by adding rows:
# assigns rows with +nil+ as needed:
# table.size # => 3
# table[5] = ['bag', 6]
# table.size # => 6
# table[3] # => nil
# table[4]# => nil
# table[5].to_h # => {"Name"=>"bag", "Value"=>6}
#
# Note that the +nil+ rows are actually +nil+, not a row of +nil+ fields.
#
# ---
#
# Set a Column by Its \Integer Index::
# - Form: table[n] = array_of_fields, +n+ an \Integer,
# +array_of_fields+ an \Array of \String fields.
# - Access mode: :col.
# - Return value: +array_of_fields+.
#
# If the column exists, it is replaced:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# new_col = [3, 4, 5]
# table.by_col! # => #
# return_value = table[1] = new_col
# return_value.equal?(new_col) # => true # Returned the column
# table[1] # => [3, 4, 5]
# # The rows, as revised:
# table.by_row! # => #
# table[0].to_h # => {"Name"=>"foo", "Value"=>3}
# table[1].to_h # => {"Name"=>"bar", "Value"=>4}
# table[2].to_h # => {"Name"=>"baz", "Value"=>5}
# table.by_col! # => #
#
# If there are too few values, fills with +nil+ values:
# table[1] = [0]
# table[1] # => [0, nil, nil]
#
# If there are too many values, ignores the extra values:
# table[1] = [0, 1, 2, 3, 4]
# table[1] # => [0, 1, 2]
#
# If a single value is given, replaces all fields in the column with that value:
# table[1] = 'bat'
# table[1] # => ["bat", "bat", "bat"]
#
# ---
#
# Set a Column by Its \String Header::
# - Form: table[header] = field_or_array_of_fields,
# +header+ a \String header, +field_or_array_of_fields+ a field value
# or an \Array of \String fields.
# - Access mode: :col or :col_or_row.
# - Return value: +field_or_array_of_fields+.
#
# If the column exists, it is replaced:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# new_col = [3, 4, 5]
# table.by_col! # => #
# return_value = table['Value'] = new_col
# return_value.equal?(new_col) # => true # Returned the column
# table['Value'] # => [3, 4, 5]
# # The rows, as revised:
# table.by_row! # => #
# table[0].to_h # => {"Name"=>"foo", "Value"=>3}
# table[1].to_h # => {"Name"=>"bar", "Value"=>4}
# table[2].to_h # => {"Name"=>"baz", "Value"=>5}
# table.by_col! # => #
#
# If there are too few values, fills with +nil+ values:
# table['Value'] = [0]
# table['Value'] # => [0, nil, nil]
#
# If there are too many values, ignores the extra values:
# table['Value'] = [0, 1, 2, 3, 4]
# table['Value'] # => [0, 1, 2]
#
# If the column does not exist, extends the table by adding columns:
# table['Note'] = ['x', 'y', 'z']
# table['Note'] # => ["x", "y", "z"]
# # The rows, as revised:
# table.by_row!
# table[0].to_h # => {"Name"=>"foo", "Value"=>0, "Note"=>"x"}
# table[1].to_h # => {"Name"=>"bar", "Value"=>1, "Note"=>"y"}
# table[2].to_h # => {"Name"=>"baz", "Value"=>2, "Note"=>"z"}
# table.by_col!
#
# If a single value is given, replaces all fields in the column with that value:
# table['Value'] = 'bat'
# table['Value'] # => ["bat", "bat", "bat"]
def []=(index_or_header, value)
if @mode == :row or # by index
(@mode == :col_or_row and index_or_header.is_a? Integer)
if value.is_a? Array
@table[index_or_header] = Row.new(headers, value)
else
@table[index_or_header] = value
end
else # set column
unless index_or_header.is_a? Integer
index = @headers.index(index_or_header) || @headers.size
@headers[index] = index_or_header
end
if value.is_a? Array # multiple values
@table.each_with_index do |row, i|
if row.header_row?
row[index_or_header] = index_or_header
else
row[index_or_header] = value[i]
end
end
else # repeated value
@table.each do |row|
if row.header_row?
row[index_or_header] = index_or_header
else
row[index_or_header] = value
end
end
end
end
end
# :call-seq:
# table.values_at(*indexes) -> array_of_rows
# table.values_at(*headers) -> array_of_columns_data
#
# If the access mode is :row or :col_or_row,
# and each argument is either an \Integer or a \Range,
# returns rows.
# Otherwise, returns columns data.
#
# In either case, the returned values are in the order
# specified by the arguments. Arguments may be repeated.
#
# ---
#
# Returns rows as an \Array of \CSV::Row objects.
#
# No argument:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.values_at # => []
#
# One index:
# values = table.values_at(0)
# values # => [#]
#
# Two indexes:
# values = table.values_at(2, 0)
# values # => [#, #]
#
# One \Range:
# values = table.values_at(1..2)
# values # => [#, #]
#
# \Ranges and indexes:
# values = table.values_at(0..1, 1..2, 0, 2)
# pp values
# Output:
# [#,
# #,
# #,
# #,
# #,
# #]
#
# ---
#
# Returns columns data as row Arrays,
# each consisting of the specified columns data for that row:
# values = table.values_at('Name')
# values # => [["foo"], ["bar"], ["baz"]]
# values = table.values_at('Value', 'Name')
# values # => [["0", "foo"], ["1", "bar"], ["2", "baz"]]
def values_at(*indices_or_headers)
if @mode == :row or # by indices
( @mode == :col_or_row and indices_or_headers.all? do |index|
index.is_a?(Integer) or
( index.is_a?(Range) and
index.first.is_a?(Integer) and
index.last.is_a?(Integer) )
end )
@table.values_at(*indices_or_headers)
else # by headers
@table.map { |row| row.values_at(*indices_or_headers) }
end
end
# :call-seq:
# table << row_or_array -> self
#
# If +row_or_array+ is a \CSV::Row object,
# it is appended to the table:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table << CSV::Row.new(table.headers, ['bat', 3])
# table[3] # => #
#
# If +row_or_array+ is an \Array, it is used to create a new
# \CSV::Row object which is then appended to the table:
# table << ['bam', 4]
# table[4] # => #
def <<(row_or_array)
if row_or_array.is_a? Array # append Array
@table << Row.new(headers, row_or_array)
else # append Row
@table << row_or_array
end
self # for chaining
end
#
# :call-seq:
# table.push(*rows_or_arrays) -> self
#
# A shortcut for appending multiple rows. Equivalent to:
# rows.each {|row| self << row }
#
# Each argument may be either a \CSV::Row object or an \Array:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# rows = [
# CSV::Row.new(table.headers, ['bat', 3]),
# ['bam', 4]
# ]
# table.push(*rows)
# table[3..4] # => [#, #]
def push(*rows)
rows.each { |row| self << row }
self # for chaining
end
# :call-seq:
# table.delete(*indexes) -> deleted_values
# table.delete(*headers) -> deleted_values
#
# If the access mode is :row or :col_or_row,
# and each argument is either an \Integer or a \Range,
# returns deleted rows.
# Otherwise, returns deleted columns data.
#
# In either case, the returned values are in the order
# specified by the arguments. Arguments may be repeated.
#
# ---
#
# Returns rows as an \Array of \CSV::Row objects.
#
# One index:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# deleted_values = table.delete(0)
# deleted_values # => [#]
#
# Two indexes:
# table = CSV.parse(source, headers: true)
# deleted_values = table.delete(2, 0)
# deleted_values # => [#, #]
#
# ---
#
# Returns columns data as column Arrays.
#
# One header:
# table = CSV.parse(source, headers: true)
# deleted_values = table.delete('Name')
# deleted_values # => ["foo", "bar", "baz"]
#
# Two headers:
# table = CSV.parse(source, headers: true)
# deleted_values = table.delete('Value', 'Name')
# deleted_values # => [["0", "1", "2"], ["foo", "bar", "baz"]]
def delete(*indexes_or_headers)
if indexes_or_headers.empty?
raise ArgumentError, "wrong number of arguments (given 0, expected 1+)"
end
deleted_values = indexes_or_headers.map do |index_or_header|
if @mode == :row or # by index
(@mode == :col_or_row and index_or_header.is_a? Integer)
@table.delete_at(index_or_header)
else # by header
if index_or_header.is_a? Integer
@headers.delete_at(index_or_header)
else
@headers.delete(index_or_header)
end
@table.map { |row| row.delete(index_or_header).last }
end
end
if indexes_or_headers.size == 1
deleted_values[0]
else
deleted_values
end
end
# :call-seq:
# table.delete_if {|row_or_column| ... } -> self
#
# Removes rows or columns for which the block returns a truthy value;
# returns +self+.
#
# Removes rows when the access mode is :row or :col_or_row;
# calls the block with each \CSV::Row object:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_row! # => #
# table.size # => 3
# table.delete_if {|row| row['Name'].start_with?('b') }
# table.size # => 1
#
# Removes columns when the access mode is :col;
# calls the block with each column as a 2-element array
# containing the header and an \Array of column fields:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_col! # => #
# table.headers.size # => 2
# table.delete_if {|column_data| column_data[1].include?('2') }
# table.headers.size # => 1
#
# Returns a new \Enumerator if no block is given:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.delete_if # => #:delete_if>
def delete_if(&block)
return enum_for(__method__) { @mode == :row or @mode == :col_or_row ? size : headers.size } unless block_given?
if @mode == :row or @mode == :col_or_row # by index
@table.delete_if(&block)
else # by header
headers.each do |header|
delete(header) if yield([header, self[header]])
end
end
self # for chaining
end
include Enumerable
# :call-seq:
# table.each {|row_or_column| ... ) -> self
#
# Calls the block with each row or column; returns +self+.
#
# When the access mode is :row or :col_or_row,
# calls the block with each \CSV::Row object:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.by_row! # => #
# table.each {|row| p row }
# Output:
# #
# #
# #
#
# When the access mode is :col,
# calls the block with each column as a 2-element array
# containing the header and an \Array of column fields:
# table.by_col! # => #
# table.each {|column_data| p column_data }
# Output:
# ["Name", ["foo", "bar", "baz"]]
# ["Value", ["0", "1", "2"]]
#
# Returns a new \Enumerator if no block is given:
# table.each # => #:each>
def each(&block)
return enum_for(__method__) { @mode == :col ? headers.size : size } unless block_given?
if @mode == :col
headers.each.with_index do |header, i|
yield([header, @table.map {|row| row[header, i]}])
end
else
@table.each(&block)
end
self # for chaining
end
# :call-seq:
# table == other_table -> true or false
#
# Returns +true+ if all each row of +self+ ==
# the corresponding row of +other_table+, otherwise, +false+.
#
# The access mode does no affect the result.
#
# Equal tables:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# other_table = CSV.parse(source, headers: true)
# table == other_table # => true
#
# Different row count:
# other_table.delete(2)
# table == other_table # => false
#
# Different last row:
# other_table << ['bat', 3]
# table == other_table # => false
def ==(other)
return @table == other.table if other.is_a? CSV::Table
@table == other
end
# :call-seq:
# table.to_a -> array_of_arrays
#
# Returns the table as an \Array of \Arrays;
# the headers are in the first row:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.to_a # => [["Name", "Value"], ["foo", "0"], ["bar", "1"], ["baz", "2"]]
def to_a
array = [headers]
@table.each do |row|
array.push(row.fields) unless row.header_row?
end
array
end
# :call-seq:
# table.to_csv(**options) -> csv_string
#
# Returns the table as \CSV string.
# See {Options for Generating}[../CSV.html#class-CSV-label-Options+for+Generating].
#
# Defaults option +write_headers+ to +true+:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.to_csv # => "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
#
# Omits the headers if option +write_headers+ is given as +false+
# (see {Option +write_headers+}[../CSV.html#class-CSV-label-Option+write_headers]):
# table.to_csv(write_headers: false) # => "foo,0\nbar,1\nbaz,2\n"
#
# Limit rows if option +limit+ is given like +2+:
# table.to_csv(limit: 2) # => "Name,Value\nfoo,0\nbar,1\n"
def to_csv(write_headers: true, limit: nil, **options)
array = write_headers ? [headers.to_csv(**options)] : []
limit ||= @table.size
limit = @table.size + 1 + limit if limit < 0
limit = 0 if limit < 0
@table.first(limit).each do |row|
array.push(row.fields.to_csv(**options)) unless row.header_row?
end
array.join("")
end
alias_method :to_s, :to_csv
#
# Extracts the nested value specified by the sequence of +index+ or +header+ objects by calling dig at each step,
# returning nil if any intermediate step is nil.
#
def dig(index_or_header, *index_or_headers)
value = self[index_or_header]
if value.nil?
nil
elsif index_or_headers.empty?
value
else
unless value.respond_to?(:dig)
raise TypeError, "#{value.class} does not have \#dig method"
end
value.dig(*index_or_headers)
end
end
# :call-seq:
# table.inspect => string
#
# Returns a US-ASCII-encoded \String showing table:
# - Class: CSV::Table.
# - Access mode: :row, :col, or :col_or_row.
# - Size: Row count, including the header row.
#
# Example:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
# table.inspect # => "#\nName,Value\nfoo,0\nbar,1\nbaz,2\n"
#
def inspect
inspected = +"#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
summary = to_csv(limit: 5)
inspected << "\n" << summary if summary.encoding.ascii_compatible?
inspected
end
end
end
csv-3.3.4/lib/csv/writer.rb 0000644 0000041 0000041 00000013476 15000146530 015553 0 ustar www-data www-data # frozen_string_literal: true
require_relative "input_record_separator"
require_relative "row"
class CSV
# Note: Don't use this class directly. This is an internal class.
class Writer
#
# A CSV::Writer receives an output, prepares the header, format and output.
# It allows us to write new rows in the object and rewind it.
#
attr_reader :lineno
attr_reader :headers
def initialize(output, options)
@output = output
@options = options
@lineno = 0
@fields_converter = nil
prepare
if @options[:write_headers] and @headers
self << @headers
end
@fields_converter = @options[:fields_converter]
end
#
# Adds a new row
#
def <<(row)
case row
when Row
row = row.fields
when Hash
row = @headers.collect {|header| row[header]}
end
@headers ||= row if @use_headers
@lineno += 1
if @fields_converter
row = @fields_converter.convert(row, nil, lineno)
end
i = -1
converted_row = row.collect do |field|
i += 1
quote(field, i)
end
line = converted_row.join(@column_separator) + @row_separator
if @output_encoding
line = line.encode(@output_encoding)
end
@output << line
self
end
#
# Winds back to the beginning
#
def rewind
@lineno = 0
@headers = nil if @options[:headers].nil?
end
private
def prepare
@encoding = @options[:encoding]
prepare_header
prepare_format
prepare_output
end
def prepare_header
headers = @options[:headers]
case headers
when Array
@headers = headers
@use_headers = true
when String
@headers = CSV.parse_line(headers,
col_sep: @options[:column_separator],
row_sep: @options[:row_separator],
quote_char: @options[:quote_character])
@use_headers = true
when true
@headers = nil
@use_headers = true
else
@headers = nil
@use_headers = false
end
return unless @headers
converter = @options[:header_fields_converter]
@headers = converter.convert(@headers, nil, 0, [])
@headers.each do |header|
header.freeze if header.is_a?(String)
end
end
def prepare_force_quotes_fields(force_quotes)
@force_quotes_fields = {}
force_quotes.each do |name_or_index|
case name_or_index
when Integer
index = name_or_index
@force_quotes_fields[index] = true
when String, Symbol
name = name_or_index.to_s
if @headers.nil?
message = ":headers is required when you use field name " +
"in :force_quotes: " +
"#{name_or_index.inspect}: #{force_quotes.inspect}"
raise ArgumentError, message
end
index = @headers.index(name)
next if index.nil?
@force_quotes_fields[index] = true
else
message = ":force_quotes element must be " +
"field index or field name: " +
"#{name_or_index.inspect}: #{force_quotes.inspect}"
raise ArgumentError, message
end
end
end
def prepare_format
@column_separator = @options[:column_separator].to_s.encode(@encoding)
row_separator = @options[:row_separator]
if row_separator == :auto
@row_separator = InputRecordSeparator.value.encode(@encoding)
else
@row_separator = row_separator.to_s.encode(@encoding)
end
@quote_character = @options[:quote_character]
force_quotes = @options[:force_quotes]
if force_quotes.is_a?(Array)
prepare_force_quotes_fields(force_quotes)
@force_quotes = false
elsif force_quotes
@force_quotes_fields = nil
@force_quotes = true
else
@force_quotes_fields = nil
@force_quotes = false
end
unless @force_quotes
@quotable_pattern =
Regexp.new("[\r\n".encode(@encoding) +
Regexp.escape(@column_separator) +
Regexp.escape(@quote_character.encode(@encoding)) +
"]".encode(@encoding))
end
@quote_empty = @options.fetch(:quote_empty, true)
end
def prepare_output
@output_encoding = nil
return unless @output.is_a?(StringIO)
output_encoding = @output.internal_encoding || @output.external_encoding
if @encoding != output_encoding
if @options[:force_encoding]
@output_encoding = output_encoding
else
compatible_encoding = Encoding.compatible?(@encoding, output_encoding)
if compatible_encoding
@output.set_encoding(compatible_encoding)
@output.seek(0, IO::SEEK_END)
end
end
end
end
def quote_field(field)
field = String(field)
encoded_quote_character = @quote_character.encode(field.encoding)
encoded_quote_character +
field.gsub(encoded_quote_character,
encoded_quote_character * 2) +
encoded_quote_character
end
def quote(field, i)
if @force_quotes
quote_field(field)
elsif @force_quotes_fields and @force_quotes_fields[i]
quote_field(field)
else
if field.nil? # represent +nil+ fields as empty unquoted fields
""
else
field = String(field) # Stringify fields
# represent empty fields as empty quoted fields
if (@quote_empty and field.empty?) or (field.valid_encoding? and @quotable_pattern.match?(field))
quote_field(field)
else
field # unquoted field
end
end
end
end
end
end
csv-3.3.4/lib/csv/core_ext/ 0000755 0000041 0000041 00000000000 15000146530 015507 5 ustar www-data www-data csv-3.3.4/lib/csv/core_ext/string.rb 0000644 0000041 0000041 00000000302 15000146530 017335 0 ustar www-data www-data class String
# Equivalent to CSV::parse_line(self, options)
#
# "CSV,data".parse_csv
# #=> ["CSV", "data"]
def parse_csv(**options)
CSV.parse_line(self, **options)
end
end
csv-3.3.4/lib/csv/core_ext/array.rb 0000644 0000041 0000041 00000000303 15000146530 017146 0 ustar www-data www-data class Array
# Equivalent to CSV::generate_line(self, options)
#
# ["CSV", "data"].to_csv
# #=> "CSV,data\n"
def to_csv(**options)
CSV.generate_line(self, **options)
end
end
csv-3.3.4/lib/csv/version.rb 0000644 0000041 0000041 00000000153 15000146530 015710 0 ustar www-data www-data # frozen_string_literal: true
class CSV
# The version of the installed library.
VERSION = "3.3.4"
end
csv-3.3.4/lib/csv/input_record_separator.rb 0000644 0000041 0000041 00000000425 15000146530 021002 0 ustar www-data www-data require "English"
require "stringio"
class CSV
module InputRecordSeparator
class << self
if RUBY_VERSION >= "3.0.0"
def value
"\n"
end
else
def value
$INPUT_RECORD_SEPARATOR
end
end
end
end
end
csv-3.3.4/lib/csv/parser.rb 0000644 0000041 0000041 00000112403 15000146530 015521 0 ustar www-data www-data # frozen_string_literal: true
require "strscan"
require_relative "input_record_separator"
require_relative "row"
require_relative "table"
class CSV
# Note: Don't use this class directly. This is an internal class.
class Parser
#
# A CSV::Parser is m17n aware. The parser works in the Encoding of the IO
# or String object being read from or written to. Your data is never transcoded
# (unless you ask Ruby to transcode it for you) and will literally be parsed in
# the Encoding it is in. Thus CSV will return Arrays or Rows of Strings in the
# Encoding of your data. This is accomplished by transcoding the parser itself
# into your Encoding.
#
class << self
ARGF_OBJECT_ID = ARGF.object_id
# Convenient method to check whether the give input reached EOF
# or not.
def eof?(input)
# We can't use input != ARGF in Ractor. Because ARGF isn't a
# shareable object.
input.object_id != ARGF_OBJECT_ID and
input.respond_to?(:eof) and
input.eof?
end
end
# Raised when encoding is invalid.
class InvalidEncoding < StandardError
end
# Raised when unexpected case is happen.
class UnexpectedError < StandardError
end
#
# CSV::Scanner receives a CSV output, scans it and return the content.
# It also controls the life cycle of the object with its methods +keep_start+,
# +keep_end+, +keep_back+, +keep_drop+.
#
# Uses StringScanner (the official strscan gem). Strscan provides lexical
# scanning operations on a String. We inherit its object and take advantage
# on the methods. For more information, please visit:
# https://ruby-doc.org/stdlib-2.6.1/libdoc/strscan/rdoc/StringScanner.html
#
class Scanner < StringScanner
alias_method :scan_all, :scan
def initialize(*args)
super
@keeps = []
end
def each_line(row_separator)
position = pos
rest.each_line(row_separator) do |line|
position += line.bytesize
self.pos = position
yield(line)
end
end
def keep_start
@keeps.push(pos)
end
def keep_end
start = @keeps.pop
string.byteslice(start, pos - start)
end
def keep_back
self.pos = @keeps.pop
end
def keep_drop
@keeps.pop
end
end
#
# CSV::InputsScanner receives IO inputs, encoding and the chunk_size.
# It also controls the life cycle of the object with its methods +keep_start+,
# +keep_end+, +keep_back+, +keep_drop+.
#
# CSV::InputsScanner.scan() tries to match with pattern at the current position.
# If there's a match, the scanner advances the "scan pointer" and returns the matched string.
# Otherwise, the scanner returns nil.
#
# CSV::InputsScanner.rest() returns the "rest" of the string (i.e. everything after the scan pointer).
# If there is no more data (eos? = true), it returns "".
#
class InputsScanner
def initialize(inputs, encoding, row_separator, chunk_size: 8192)
@inputs = inputs.dup
@encoding = encoding
@row_separator = row_separator
@chunk_size = chunk_size
@last_scanner = @inputs.empty?
@keeps = []
read_chunk
end
def each_line(row_separator)
return enum_for(__method__, row_separator) unless block_given?
buffer = nil
input = @scanner.rest
position = @scanner.pos
offset = 0
n_row_separator_chars = row_separator.size
# trace(__method__, :start, input)
while true
input.each_line(row_separator) do |line|
@scanner.pos += line.bytesize
if buffer
if n_row_separator_chars == 2 and
buffer.end_with?(row_separator[0]) and
line.start_with?(row_separator[1])
buffer << line[0]
line = line[1..-1]
position += buffer.bytesize + offset
@scanner.pos = position
offset = 0
yield(buffer)
buffer = nil
next if line.empty?
else
buffer << line
line = buffer
buffer = nil
end
end
if line.end_with?(row_separator)
position += line.bytesize + offset
@scanner.pos = position
offset = 0
yield(line)
else
buffer = line
end
end
break unless read_chunk
input = @scanner.rest
position = @scanner.pos
offset = -buffer.bytesize if buffer
end
yield(buffer) if buffer
end
def scan(pattern)
# trace(__method__, pattern, :start)
value = @scanner.scan(pattern)
# trace(__method__, pattern, :done, :last, value) if @last_scanner
return value if @last_scanner
read_chunk if value and @scanner.eos?
# trace(__method__, pattern, :done, value)
value
end
def scan_all(pattern)
# trace(__method__, pattern, :start)
value = @scanner.scan(pattern)
# trace(__method__, pattern, :done, :last, value) if @last_scanner
return value if @last_scanner
# trace(__method__, pattern, :done, :nil) if value.nil?
return nil if value.nil?
while @scanner.eos? and read_chunk and (sub_value = @scanner.scan(pattern))
# trace(__method__, pattern, :sub, sub_value)
value << sub_value
end
# trace(__method__, pattern, :done, value)
value
end
def eos?
@scanner.eos?
end
def keep_start
# trace(__method__, :start)
adjust_last_keep
@keeps.push([@scanner, @scanner.pos, nil])
# trace(__method__, :done)
end
def keep_end
# trace(__method__, :start)
scanner, start, buffer = @keeps.pop
if scanner == @scanner
keep = @scanner.string.byteslice(start, @scanner.pos - start)
else
keep = @scanner.string.byteslice(0, @scanner.pos)
end
if buffer
buffer << keep
keep = buffer
end
# trace(__method__, :done, keep)
keep
end
def keep_back
# trace(__method__, :start)
scanner, start, buffer = @keeps.pop
if buffer
# trace(__method__, :rescan, start, buffer)
string = @scanner.string
if scanner == @scanner
keep = string.byteslice(start,
string.bytesize - @scanner.pos - start)
else
keep = string
end
if keep and not keep.empty?
@inputs.unshift(StringIO.new(keep))
@last_scanner = false
end
@scanner = StringScanner.new(buffer)
else
if @scanner != scanner
message = "scanners are different but no buffer: "
message += "#{@scanner.inspect}(#{@scanner.object_id}): "
message += "#{scanner.inspect}(#{scanner.object_id})"
raise UnexpectedError, message
end
# trace(__method__, :repos, start, buffer)
@scanner.pos = start
last_scanner, last_start, last_buffer = @keeps.last
# Drop the last buffer when the last buffer is the same data
# in the last keep. If we keep it, we have duplicated data
# by the next keep_back.
if last_scanner == @scanner and
last_buffer and
last_buffer == last_scanner.string.byteslice(last_start, start)
@keeps.last[2] = nil
end
end
read_chunk if @scanner.eos?
end
def keep_drop
_, _, buffer = @keeps.pop
# trace(__method__, :done, :empty) unless buffer
return unless buffer
last_keep = @keeps.last
# trace(__method__, :done, :no_last_keep) unless last_keep
return unless last_keep
if last_keep[2]
last_keep[2] << buffer
else
last_keep[2] = buffer
end
# trace(__method__, :done)
end
def rest
@scanner.rest
end
def check(pattern)
@scanner.check(pattern)
end
private
def trace(*args)
pp([*args, @scanner, @scanner&.string, @scanner&.pos, @keeps])
end
def adjust_last_keep
# trace(__method__, :start)
keep = @keeps.last
# trace(__method__, :done, :empty) if keep.nil?
return if keep.nil?
scanner, start, buffer = keep
string = @scanner.string
if @scanner != scanner
start = 0
end
if start == 0 and @scanner.eos?
keep_data = string
else
keep_data = string.byteslice(start, @scanner.pos - start)
end
if keep_data
if buffer
buffer << keep_data
else
keep[2] = keep_data.dup
end
end
# trace(__method__, :done)
end
def read_chunk
return false if @last_scanner
adjust_last_keep
input = @inputs.first
case input
when StringIO
string = input.read
raise InvalidEncoding unless string.valid_encoding?
# trace(__method__, :stringio, string)
@scanner = StringScanner.new(string)
@inputs.shift
@last_scanner = @inputs.empty?
true
else
chunk = input.gets(@row_separator, @chunk_size)
if chunk
raise InvalidEncoding unless chunk.valid_encoding?
# trace(__method__, :chunk, chunk)
@scanner = StringScanner.new(chunk)
if Parser.eof?(input)
@inputs.shift
@last_scanner = @inputs.empty?
end
true
else
# trace(__method__, :no_chunk)
@scanner = StringScanner.new("".encode(@encoding))
@inputs.shift
@last_scanner = @inputs.empty?
if @last_scanner
false
else
read_chunk
end
end
end
end
end
def initialize(input, options)
@input = input
@options = options
@samples = []
prepare
end
def column_separator
@column_separator
end
def row_separator
@row_separator
end
def quote_character
@quote_character
end
def field_size_limit
@max_field_size&.succ
end
def max_field_size
@max_field_size
end
def skip_lines
@skip_lines
end
def unconverted_fields?
@unconverted_fields
end
def headers
@headers
end
def header_row?
@use_headers and @headers.nil?
end
def return_headers?
@return_headers
end
def skip_blanks?
@skip_blanks
end
def liberal_parsing?
@liberal_parsing
end
def lineno
@lineno
end
def line
last_line
end
def parse(&block)
return to_enum(__method__) unless block_given?
if @return_headers and @headers and @raw_headers
headers = Row.new(@headers, @raw_headers, true)
if @unconverted_fields
headers = add_unconverted_fields(headers, [])
end
yield headers
end
begin
@scanner ||= build_scanner
__send__(@parse_method, &block)
rescue InvalidEncoding
if @scanner
ignore_broken_line
lineno = @lineno
else
lineno = @lineno + 1
end
raise InvalidEncodingError.new(@encoding, lineno)
rescue UnexpectedError => error
if @scanner
ignore_broken_line
lineno = @lineno
else
lineno = @lineno + 1
end
message = "This should not be happen: #{error.message}: "
message += "Please report this to https://github.com/ruby/csv/issues"
raise MalformedCSVError.new(message, lineno)
end
end
def use_headers?
@use_headers
end
private
# A set of tasks to prepare the file in order to parse it
def prepare
prepare_variable
prepare_quote_character
prepare_backslash
prepare_skip_lines
prepare_strip
prepare_separators
validate_strip_and_col_sep_options
prepare_quoted
prepare_unquoted
prepare_line
prepare_header
prepare_parser
end
def prepare_variable
@encoding = @options[:encoding]
liberal_parsing = @options[:liberal_parsing]
if liberal_parsing
@liberal_parsing = true
if liberal_parsing.is_a?(Hash)
@double_quote_outside_quote =
liberal_parsing[:double_quote_outside_quote]
@backslash_quote = liberal_parsing[:backslash_quote]
else
@double_quote_outside_quote = false
@backslash_quote = false
end
else
@liberal_parsing = false
@backslash_quote = false
end
@unconverted_fields = @options[:unconverted_fields]
@max_field_size = @options[:max_field_size]
@skip_blanks = @options[:skip_blanks]
@fields_converter = @options[:fields_converter]
@header_fields_converter = @options[:header_fields_converter]
end
def prepare_quote_character
@quote_character = @options[:quote_character]
if @quote_character.nil?
@escaped_quote_character = nil
@escaped_quote = nil
else
@quote_character = @quote_character.to_s.encode(@encoding)
if @quote_character.length != 1
message = ":quote_char has to be nil or a single character String"
raise ArgumentError, message
end
@escaped_quote_character = Regexp.escape(@quote_character)
@escaped_quote = Regexp.new(@escaped_quote_character)
end
end
def prepare_backslash
return unless @backslash_quote
@backslash_character = "\\".encode(@encoding)
@escaped_backslash_character = Regexp.escape(@backslash_character)
@escaped_backslash = Regexp.new(@escaped_backslash_character)
if @quote_character.nil?
@backslash_quote_character = nil
else
@backslash_quote_character =
@backslash_character + @escaped_quote_character
end
end
def prepare_skip_lines
skip_lines = @options[:skip_lines]
case skip_lines
when String
@skip_lines = skip_lines.encode(@encoding)
when Regexp, nil
@skip_lines = skip_lines
else
unless skip_lines.respond_to?(:match)
message =
":skip_lines has to respond to \#match: #{skip_lines.inspect}"
raise ArgumentError, message
end
@skip_lines = skip_lines
end
end
def prepare_strip
@strip = @options[:strip]
@escaped_strip = nil
@strip_value = nil
@rstrip_value = nil
if @strip.is_a?(String)
case @strip.length
when 0
raise ArgumentError, ":strip must not be an empty String"
when 1
# ok
else
raise ArgumentError, ":strip doesn't support 2 or more characters yet"
end
@strip = @strip.encode(@encoding)
@escaped_strip = Regexp.escape(@strip)
if @quote_character
@strip_value = Regexp.new(@escaped_strip +
"+".encode(@encoding))
@rstrip_value = Regexp.new(@escaped_strip +
"+\\z".encode(@encoding))
end
elsif @strip
strip_values = " \t\f\v"
@escaped_strip = strip_values.encode(@encoding)
if @quote_character
@strip_value = Regexp.new("[#{strip_values}]+".encode(@encoding))
@rstrip_value = Regexp.new("[#{strip_values}]+\\z".encode(@encoding))
end
end
end
begin
StringScanner.new("x").scan("x")
rescue TypeError
STRING_SCANNER_SCAN_ACCEPT_STRING = false
else
STRING_SCANNER_SCAN_ACCEPT_STRING = true
end
def prepare_separators
column_separator = @options[:column_separator]
@column_separator = column_separator.to_s.encode(@encoding)
if @column_separator.size < 1
message = ":col_sep must be 1 or more characters: "
message += column_separator.inspect
raise ArgumentError, message
end
@row_separator =
resolve_row_separator(@options[:row_separator]).encode(@encoding)
@escaped_column_separator = Regexp.escape(@column_separator)
@escaped_first_column_separator = Regexp.escape(@column_separator[0])
if @column_separator.size > 1
@column_end = Regexp.new(@escaped_column_separator)
@column_ends = @column_separator.each_char.collect do |char|
Regexp.new(Regexp.escape(char))
end
@first_column_separators = Regexp.new(@escaped_first_column_separator +
"+".encode(@encoding))
else
if STRING_SCANNER_SCAN_ACCEPT_STRING
@column_end = @column_separator
else
@column_end = Regexp.new(@escaped_column_separator)
end
@column_ends = nil
@first_column_separators = nil
end
escaped_row_separator = Regexp.escape(@row_separator)
@row_end = Regexp.new(escaped_row_separator)
if @row_separator.size > 1
@row_ends = @row_separator.each_char.collect do |char|
Regexp.new(Regexp.escape(char))
end
else
@row_ends = nil
end
@cr = "\r".encode(@encoding)
@lf = "\n".encode(@encoding)
@line_end = Regexp.new("\r\n|\n|\r".encode(@encoding))
@not_line_end = Regexp.new("[^\r\n]+".encode(@encoding))
end
# This method verifies that there are no (obvious) ambiguities with the
# provided +col_sep+ and +strip+ parsing options. For example, if +col_sep+
# and +strip+ were both equal to +\t+, then there would be no clear way to
# parse the input.
def validate_strip_and_col_sep_options
return unless @strip
if @strip.is_a?(String)
if @column_separator.start_with?(@strip) || @column_separator.end_with?(@strip)
raise ArgumentError,
"The provided strip (#{@escaped_strip}) and " \
"col_sep (#{@escaped_column_separator}) options are incompatible."
end
else
if Regexp.new("\\A[#{@escaped_strip}]|[#{@escaped_strip}]\\z").match?(@column_separator)
raise ArgumentError,
"The provided strip (true) and " \
"col_sep (#{@escaped_column_separator}) options are incompatible."
end
end
end
def prepare_quoted
if @quote_character
@quotes = Regexp.new(@escaped_quote_character +
"+".encode(@encoding))
no_quoted_values = @escaped_quote_character.dup
if @backslash_quote
no_quoted_values << @escaped_backslash_character
end
@quoted_value = Regexp.new("[^".encode(@encoding) +
no_quoted_values +
"]+".encode(@encoding))
end
if @escaped_strip
@split_column_separator = Regexp.new(@escaped_strip +
"*".encode(@encoding) +
@escaped_column_separator +
@escaped_strip +
"*".encode(@encoding))
else
if @column_separator == " ".encode(@encoding)
@split_column_separator = Regexp.new(@escaped_column_separator)
else
@split_column_separator = @column_separator
end
end
end
def prepare_unquoted
return if @quote_character.nil?
no_unquoted_values = "\r\n".encode(@encoding)
no_unquoted_values << @escaped_first_column_separator
unless @liberal_parsing
no_unquoted_values << @escaped_quote_character
end
@unquoted_value = Regexp.new("[^".encode(@encoding) +
no_unquoted_values +
"]+".encode(@encoding))
end
def resolve_row_separator(separator)
if separator == :auto
cr = "\r".encode(@encoding)
lf = "\n".encode(@encoding)
if @input.is_a?(StringIO)
pos = @input.pos
separator = detect_row_separator(@input.read, cr, lf)
@input.seek(pos)
elsif @input.respond_to?(:gets)
if @input.is_a?(File)
chunk_size = 32 * 1024
else
chunk_size = 1024
end
begin
while separator == :auto
#
# if we run out of data, it's probably a single line
# (ensure will set default value)
#
break unless sample = @input.gets(nil, chunk_size)
# extend sample if we're unsure of the line ending
if sample.end_with?(cr)
sample << (@input.gets(nil, 1) || "")
end
@samples << sample
separator = detect_row_separator(sample, cr, lf)
end
rescue IOError
# do nothing: ensure will set default
end
end
separator = InputRecordSeparator.value if separator == :auto
end
separator.to_s.encode(@encoding)
end
def detect_row_separator(sample, cr, lf)
lf_index = sample.index(lf)
if lf_index
cr_index = sample[0, lf_index].index(cr)
else
cr_index = sample.index(cr)
end
if cr_index and lf_index
if cr_index + 1 == lf_index
cr + lf
elsif cr_index < lf_index
cr
else
lf
end
elsif cr_index
cr
elsif lf_index
lf
else
:auto
end
end
def prepare_line
@lineno = 0
@last_line = nil
@scanner = nil
end
def last_line
if @scanner
@last_line ||= @scanner.keep_end
else
@last_line
end
end
def prepare_header
@return_headers = @options[:return_headers]
headers = @options[:headers]
case headers
when Array
@raw_headers = headers
quoted_fields = FieldsConverter::NO_QUOTED_FIELDS
@use_headers = true
when String
@raw_headers, quoted_fields = parse_headers(headers)
@use_headers = true
when nil, false
@raw_headers = nil
@use_headers = false
else
@raw_headers = nil
@use_headers = true
end
if @raw_headers
@headers = adjust_headers(@raw_headers, quoted_fields)
else
@headers = nil
end
end
def parse_headers(row)
quoted_fields = []
converter = lambda do |field, info|
quoted_fields << info.quoted?
field
end
headers = CSV.parse_line(row,
col_sep: @column_separator,
row_sep: @row_separator,
quote_char: @quote_character,
converters: [converter])
[headers, quoted_fields]
end
def adjust_headers(headers, quoted_fields)
adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno, quoted_fields)
adjusted_headers.each {|h| h.freeze if h.is_a? String}
adjusted_headers
end
def prepare_parser
@may_quoted = may_quoted?
if @quote_character.nil?
@parse_method = :parse_no_quote
elsif @liberal_parsing or @strip
@parse_method = :parse_quotable_robust
else
@parse_method = :parse_quotable_loose
end
end
def may_quoted?
return false if @quote_character.nil?
if @input.is_a?(StringIO)
pos = @input.pos
sample = @input.read
@input.seek(pos)
else
return false if @samples.empty?
sample = @samples.first
end
sample[0, 128].index(@quote_character)
end
class UnoptimizedStringIO # :nodoc:
def initialize(string)
@io = StringIO.new(string, "rb:#{string.encoding}")
end
def gets(*args)
@io.gets(*args)
end
def each_line(*args, &block)
@io.each_line(*args, &block)
end
def eof?
@io.eof?
end
end
SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
if SCANNER_TEST
SCANNER_TEST_CHUNK_SIZE_NAME = "CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"
SCANNER_TEST_CHUNK_SIZE_VALUE = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
def build_scanner
inputs = @samples.collect do |sample|
UnoptimizedStringIO.new(sample)
end
if @input.is_a?(StringIO)
inputs << UnoptimizedStringIO.new(@input.read)
else
inputs << @input
end
begin
chunk_size_value = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
rescue # Ractor::IsolationError
# Ractor on Ruby 3.0 can't read ENV value.
chunk_size_value = SCANNER_TEST_CHUNK_SIZE_VALUE
end
chunk_size = Integer((chunk_size_value || "1"), 10)
InputsScanner.new(inputs,
@encoding,
@row_separator,
chunk_size: chunk_size)
end
else
def build_scanner
string = nil
if @samples.empty? and @input.is_a?(StringIO)
string = @input.read
elsif @samples.size == 1 and Parser.eof?(@input)
string = @samples[0]
end
if string
unless string.valid_encoding?
index = string.lines(@row_separator).index do |line|
!line.valid_encoding?
end
if index
raise InvalidEncodingError.new(@encoding, @lineno + index + 1)
end
end
Scanner.new(string)
else
inputs = @samples.collect do |sample|
StringIO.new(sample)
end
inputs << @input
InputsScanner.new(inputs, @encoding, @row_separator)
end
end
end
def skip_needless_lines
return unless @skip_lines
until @scanner.eos?
@scanner.keep_start
line = @scanner.scan_all(@not_line_end) || "".encode(@encoding)
line << @row_separator if parse_row_end
if skip_line?(line)
@lineno += 1
@scanner.keep_drop
else
@scanner.keep_back
return
end
end
end
def skip_line?(line)
line = line.delete_suffix(@row_separator)
case @skip_lines
when String
line.include?(@skip_lines)
when Regexp
@skip_lines.match?(line)
else
@skip_lines.match(line)
end
end
def validate_field_size(field)
return unless @max_field_size
return if field.size <= @max_field_size
ignore_broken_line
message = "Field size exceeded: #{field.size} > #{@max_field_size}"
raise MalformedCSVError.new(message, @lineno)
end
def parse_no_quote(&block)
@scanner.each_line(@row_separator) do |line|
next if @skip_lines and skip_line?(line)
original_line = line
line = line.delete_suffix(@row_separator)
if line.empty?
next if @skip_blanks
row = []
else
line = strip_value(line)
row = line.split(@split_column_separator, -1)
if @max_field_size
row.each do |column|
validate_field_size(column)
end
end
n_columns = row.size
i = 0
while i < n_columns
row[i] = nil if row[i].empty?
i += 1
end
end
@last_line = original_line
emit_row(row, &block)
end
end
def parse_quotable_loose(&block)
@scanner.keep_start
@scanner.each_line(@row_separator) do |line|
if @skip_lines and skip_line?(line)
@scanner.keep_drop
@scanner.keep_start
next
end
original_line = line
line = line.delete_suffix(@row_separator)
if line.empty?
if @skip_blanks
@scanner.keep_drop
@scanner.keep_start
next
end
row = []
quoted_fields = FieldsConverter::NO_QUOTED_FIELDS
elsif line.include?(@cr) or line.include?(@lf)
@scanner.keep_back
@parse_method = :parse_quotable_robust
return parse_quotable_robust(&block)
else
row = line.split(@split_column_separator, -1)
quoted_fields = []
n_columns = row.size
i = 0
while i < n_columns
column = row[i]
if column.empty?
quoted_fields << false
row[i] = nil
else
n_quotes = column.count(@quote_character)
if n_quotes.zero?
quoted_fields << false
# no quote
elsif n_quotes == 2 and
column.start_with?(@quote_character) and
column.end_with?(@quote_character)
quoted_fields << true
row[i] = column[1..-2]
else
@scanner.keep_back
@parse_method = :parse_quotable_robust
return parse_quotable_robust(&block)
end
validate_field_size(row[i])
end
i += 1
end
end
@scanner.keep_drop
@scanner.keep_start
@last_line = original_line
emit_row(row, quoted_fields, &block)
end
@scanner.keep_drop
end
def parse_quotable_robust(&block)
row = []
quoted_fields = []
skip_needless_lines
start_row
while true
@quoted_column_value = false
@unquoted_column_value = false
@scanner.scan_all(@strip_value) if @strip_value
value = parse_column_value
if value
@scanner.scan_all(@strip_value) if @strip_value
validate_field_size(value)
end
if parse_column_end
row << value
quoted_fields << @quoted_column_value
elsif parse_row_end
if row.empty? and value.nil?
emit_row([], &block) unless @skip_blanks
else
row << value
quoted_fields << @quoted_column_value
emit_row(row, quoted_fields, &block)
row = []
quoted_fields.clear
end
skip_needless_lines
start_row
elsif @scanner.eos?
break if row.empty? and value.nil?
row << value
quoted_fields << @quoted_column_value
emit_row(row, quoted_fields, &block)
break
else
if @quoted_column_value
if liberal_parsing? and (new_line = @scanner.check(@line_end))
message =
"Illegal end-of-line sequence outside of a quoted field " +
"<#{new_line.inspect}>"
else
message = "Any value after quoted field isn't allowed"
end
ignore_broken_line
raise MalformedCSVError.new(message, @lineno)
elsif @unquoted_column_value and
(new_line = @scanner.scan(@line_end))
ignore_broken_line
message = "Unquoted fields do not allow new line " +
"<#{new_line.inspect}>"
raise MalformedCSVError.new(message, @lineno)
elsif @scanner.rest.start_with?(@quote_character)
ignore_broken_line
message = "Illegal quoting"
raise MalformedCSVError.new(message, @lineno)
elsif (new_line = @scanner.scan(@line_end))
ignore_broken_line
message = "New line must be <#{@row_separator.inspect}> " +
"not <#{new_line.inspect}>"
raise MalformedCSVError.new(message, @lineno)
else
ignore_broken_line
raise MalformedCSVError.new("TODO: Meaningful message",
@lineno)
end
end
end
end
def parse_column_value
if @liberal_parsing
quoted_value = parse_quoted_column_value
if quoted_value
@scanner.scan_all(@strip_value) if @strip_value
unquoted_value = parse_unquoted_column_value
if unquoted_value
if @double_quote_outside_quote
unquoted_value = unquoted_value.gsub(@quote_character * 2,
@quote_character)
if quoted_value.empty? # %Q{""...} case
return @quote_character + unquoted_value
end
end
@quote_character + quoted_value + @quote_character + unquoted_value
else
quoted_value
end
else
parse_unquoted_column_value
end
elsif @may_quoted
parse_quoted_column_value ||
parse_unquoted_column_value
else
parse_unquoted_column_value ||
parse_quoted_column_value
end
end
def parse_unquoted_column_value
value = @scanner.scan_all(@unquoted_value)
return nil unless value
@unquoted_column_value = true
if @first_column_separators
while true
@scanner.keep_start
is_column_end = @column_ends.all? do |column_end|
@scanner.scan(column_end)
end
@scanner.keep_back
break if is_column_end
sub_separator = @scanner.scan_all(@first_column_separators)
break if sub_separator.nil?
value << sub_separator
sub_value = @scanner.scan_all(@unquoted_value)
break if sub_value.nil?
value << sub_value
end
end
value.gsub!(@backslash_quote_character, @quote_character) if @backslash_quote
if @rstrip_value
value.gsub!(@rstrip_value, "")
end
value
end
def parse_quoted_column_value
quotes = @scanner.scan_all(@quotes)
return nil unless quotes
@quoted_column_value = true
n_quotes = quotes.size
if (n_quotes % 2).zero?
quotes[0, (n_quotes - 2) / 2]
else
value = quotes[0, n_quotes / 2]
while true
quoted_value = @scanner.scan_all(@quoted_value)
value << quoted_value if quoted_value
if @backslash_quote
if @scanner.scan(@escaped_backslash)
if @scanner.scan(@escaped_quote)
value << @quote_character
else
value << @backslash_character
end
next
end
end
quotes = @scanner.scan_all(@quotes)
unless quotes
ignore_broken_line
message = "Unclosed quoted field"
raise MalformedCSVError.new(message, @lineno)
end
n_quotes = quotes.size
if n_quotes == 1
break
else
value << quotes[0, n_quotes / 2]
break if (n_quotes % 2) == 1
end
end
value
end
end
def parse_column_end
return true if @scanner.scan(@column_end)
return false unless @column_ends
@scanner.keep_start
if @column_ends.all? {|column_end| @scanner.scan(column_end)}
@scanner.keep_drop
true
else
@scanner.keep_back
false
end
end
def parse_row_end
return true if @scanner.scan(@row_end)
return false unless @row_ends
@scanner.keep_start
if @row_ends.all? {|row_end| @scanner.scan(row_end)}
@scanner.keep_drop
true
else
@scanner.keep_back
false
end
end
def strip_value(value)
return value unless @strip
return value if value.nil?
case @strip
when String
while value.delete_prefix!(@strip)
# do nothing
end
while value.delete_suffix!(@strip)
# do nothing
end
else
value.strip!
end
value
end
def ignore_broken_line
@scanner.scan_all(@not_line_end)
@scanner.scan_all(@line_end)
@lineno += 1
end
def start_row
if @last_line
@last_line = nil
else
@scanner.keep_drop
end
@scanner.keep_start
end
def emit_row(row, quoted_fields=FieldsConverter::NO_QUOTED_FIELDS, &block)
@lineno += 1
raw_row = row
if @use_headers
if @headers.nil?
@headers = adjust_headers(row, quoted_fields)
return unless @return_headers
row = Row.new(@headers, row, true)
else
row = Row.new(@headers,
@fields_converter.convert(raw_row, @headers, @lineno, quoted_fields))
end
else
# convert fields, if needed...
row = @fields_converter.convert(raw_row, nil, @lineno, quoted_fields)
end
# inject unconverted fields and accessor, if requested...
if @unconverted_fields and not row.respond_to?(:unconverted_fields)
add_unconverted_fields(row, raw_row)
end
yield(row)
end
# This method injects an instance variable unconverted_fields into
# +row+ and an accessor method for +row+ called unconverted_fields(). The
# variable is set to the contents of +fields+.
def add_unconverted_fields(row, fields)
class << row
attr_reader :unconverted_fields
end
row.instance_variable_set(:@unconverted_fields, fields)
row
end
end
end
csv-3.3.4/lib/csv/fields_converter.rb 0000644 0000041 0000041 00000005305 15000146530 017564 0 ustar www-data www-data # frozen_string_literal: true
class CSV
# Note: Don't use this class directly. This is an internal class.
class FieldsConverter
include Enumerable
NO_QUOTED_FIELDS = [] # :nodoc:
def NO_QUOTED_FIELDS.[](_index)
false
end
NO_QUOTED_FIELDS.freeze
#
# A CSV::FieldsConverter is a data structure for storing the
# fields converter properties to be passed as a parameter
# when parsing a new file (e.g. CSV::Parser.new(@io, parser_options))
#
def initialize(options={})
@converters = []
@nil_value = options[:nil_value]
@empty_value = options[:empty_value]
@empty_value_is_empty_string = (@empty_value == "")
@accept_nil = options[:accept_nil]
@builtin_converters_name = options[:builtin_converters_name]
@need_static_convert = need_static_convert?
end
def add_converter(name=nil, &converter)
if name.nil? # custom converter
@converters << converter
else # named converter
combo = builtin_converters[name]
case combo
when Array # combo converter
combo.each do |sub_name|
add_converter(sub_name)
end
else # individual named converter
@converters << combo
end
end
end
def each(&block)
@converters.each(&block)
end
def empty?
@converters.empty?
end
def convert(fields, headers, lineno, quoted_fields=NO_QUOTED_FIELDS)
return fields unless need_convert?
fields.collect.with_index do |field, index|
if field.nil?
field = @nil_value
elsif field.is_a?(String) and field.empty?
field = @empty_value unless @empty_value_is_empty_string
end
@converters.each do |converter|
break if field.nil? and @accept_nil
if converter.arity == 1 # straight field converter
field = converter[field]
else # FieldInfo converter
if headers
header = headers[index]
else
header = nil
end
quoted = quoted_fields[index]
field = converter[field, FieldInfo.new(index, lineno, header, quoted)]
end
break unless field.is_a?(String) # short-circuit pipeline for speed
end
field # final state of each field, converted or original
end
end
private
def need_static_convert?
not (@nil_value.nil? and @empty_value_is_empty_string)
end
def need_convert?
@need_static_convert or
(not @converters.empty?)
end
def builtin_converters
@builtin_converters ||= ::CSV.const_get(@builtin_converters_name)
end
end
end
csv-3.3.4/LICENSE.txt 0000644 0000041 0000041 00000003562 15000146530 014167 0 ustar www-data www-data Copyright (C) 2005-2016 James Edward Gray II. All rights reserved.
Copyright (C) 2007-2017 Yukihiro Matsumoto. All rights reserved.
Copyright (C) 2017 SHIBATA Hiroshi. All rights reserved.
Copyright (C) 2017 Olivier Lacan. All rights reserved.
Copyright (C) 2017 Espartaco Palma. All rights reserved.
Copyright (C) 2017 Marcus Stollsteimer. All rights reserved.
Copyright (C) 2017 pavel. All rights reserved.
Copyright (C) 2017-2018 Steven Daniels. All rights reserved.
Copyright (C) 2018 Tomohiro Ogoke. All rights reserved.
Copyright (C) 2018 Kouhei Sutou. All rights reserved.
Copyright (C) 2018 Mitsutaka Mimura. All rights reserved.
Copyright (C) 2018 Vladislav. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
csv-3.3.4/README.md 0000644 0000041 0000041 00000003126 15000146530 013617 0 ustar www-data www-data # CSV
This library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.
## Installation
Add this line to your application's Gemfile:
```ruby
gem 'csv'
```
And then execute:
$ bundle
Or install it yourself as:
$ gem install csv
## Usage
```ruby
require "csv"
CSV.foreach("path/to/file.csv") do |row|
# use row here...
end
```
## Documentation
- [API](https://ruby.github.io/csv/): all classes, methods, and constants.
- [Recipes](https://ruby.github.io/csv/doc/csv/recipes/recipes_rdoc.html): specific code for specific tasks.
## Development
After checking out the repo, run `ruby run-test.rb` to check if your changes can pass the test.
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/ruby/csv.
### NOTE: About RuboCop
We don't use RuboCop because we can manage our coding style by ourselves. We want to accept small fluctuations in our coding style because we use Ruby.
Please do not submit issues and PRs that aim to introduce RuboCop in this repository.
## License
The gem is available as open source under the terms of the [2-Clause BSD License](https://opensource.org/licenses/BSD-2-Clause).
See LICENSE.txt for details.