Class DateConverter
- java.lang.Object
-
- org.apache.pdfbox.util.DateConverter
-
public class DateConverter extends java.lang.Object
This class is used to convert dates to strings and back using the PDF date standard in section 3.8.2 of PDF Reference 1.7.
- Author:
- Ben Litchfield, Fred Hansen
TODO: Move members of this class elsewhere for shared use in pdfbox, xmpbox, and jempbox.
-
-
Field Summary
Modifier and Type Field Description
static int
INVALID_YEAR
Error value if date is invalid.
-
Method Summary
Modifier and Type Method Description
static void
adjustTimeZoneNicely(java.util.GregorianCalendar cal, java.util.TimeZone tz)
Install a TimeZone on a GregorianCalendar without changing the hours value.
static java.lang.String
formatTZoffset(long millis, java.lang.String sep)
Formats a time zone offset as #hh^mm where # is + or -, hh is hours, ^ is a separator, and mm is minutes.
static java.lang.String[]
getFormats()
Get all known formats.
static java.util.GregorianCalendar
newGreg()
Construct a new GregorianCalendar and set defaults.
static java.util.GregorianCalendar
parseBigEndianDate(java.lang.String text, java.text.ParsePosition initialWhere)
Parses a big-endian date: year month day hour min sec.
static java.util.Calendar
parseDate(java.lang.String text, java.lang.String[] moreFmts, java.text.ParsePosition initialWhere)
Parses a String to see if it begins with a date, and if so, returns that date.
static java.util.GregorianCalendar
parseSimpleDate(java.lang.String text, java.lang.String[] fmts, java.text.ParsePosition initialWhere)
See if text can be parsed as a date according to any of a list of formats.
static int
parseTimeField(java.lang.String text, java.text.ParsePosition where, int maxlen, int remedy)
Parses an integer from a string, starting at and advancing a ParsePosition.
static boolean
parseTZoffset(java.lang.String text, java.util.GregorianCalendar cal, java.text.ParsePosition initialWhere)
Parses the end of a date string for a time zone and, if one is found, sets the time zone of the GregorianCalendar.
static int
restrainTZoffset(long proposedOffset)
Constrain a timezone offset to the range [-14:00 thru +14:00].
static char
skipOptionals(java.lang.String text, java.text.ParsePosition where, java.lang.String optionals)
Advances the ParsePosition past any and all the characters that match those in the optionals list.
static boolean
skipString(java.lang.String text, java.lang.String victim, java.text.ParsePosition where)
If the victim string is at the given position in the text, this method advances the position past that string.
static java.util.Calendar
toCalendar(java.lang.String text)
Deprecated. This method throws an IOException for failure.
static java.util.Calendar
toCalendar(java.lang.String text, java.lang.String[] moreFmts)
Converts a string to a calendar.
static java.util.Calendar
toCalendar(COSString text)
Deprecated. This method throws an IOException for failure.
static java.lang.String
toISO8601(java.util.Calendar cal)
Converts the date to ISO 8601 string format: yyyy-mm-ddThh:MM:ss#hh:mm (where '#' is '+' or '-').
static java.lang.String
toString(java.util.Calendar cal)
Converts a Calendar to a string formatted as: D:yyyyMMddHHmmss#hh'mm' where # is Z, +, or -.
-
-
-
Field Detail
-
INVALID_YEAR
public static final int INVALID_YEAR
Error value if date is invalid. Parsing is done with GregorianCalendar.setLenient(false), so every date field value must be within bounds. If an attempt is made to parse an invalid date field, toCalendar(String, String[]) returns Jan 1 in year INVALID_YEAR.
- See Also:
- Constant Field Values
-
-
Method Detail
-
getFormats
public static java.lang.String[] getFormats()
Get all known formats.
- Returns:
- an array containing all known formats
-
toString
public static java.lang.String toString(java.util.Calendar cal)
Converts a Calendar to a string formatted as: D:yyyyMMddHHmmss#hh'mm' where # is Z, +, or -.
- Parameters:
cal
- The date to convert to a string. May be null. The DST_OFFSET is included when computing the output time zone.
- Returns:
- The date as a String to be used in a PDF document, or null if the cal value is null.
-
toISO8601
public static java.lang.String toISO8601(java.util.Calendar cal)
Converts the date to ISO 8601 string format: yyyy-mm-ddThh:MM:ss#hh:mm (where '#' is '+' or '-').
- Parameters:
cal
- The date to convert. Must not be null. The DST_OFFSET is included in the output value.
- Returns:
- The date represented as an ISO 8601 string.
-
restrainTZoffset
public static int restrainTZoffset(long proposedOffset)
Constrain a timezone offset to the range [-14:00 thru +14:00].
- Parameters:
proposedOffset
- A value intended to be a timezone offset.
- Returns:
- The corresponding value reduced to the above noted range by adding or subtracting multiples of a full day.
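The reduction described above can be illustrated with a small stand-alone sketch. It is not the PDFBox source: the class and method names are hypothetical, and because this page does not say which representative is chosen for an out-of-range value, the sketch assumes the one closest to zero.

```java
final class OffsetRangeSketch {
    private static final long MILLIS_PER_HOUR = 60L * 60L * 1000L;
    private static final long DAY = 24L * MILLIS_PER_HOUR;
    private static final long MAX_OFFSET = 14L * MILLIS_PER_HOUR;

    // Hypothetical stand-in for restrainTZoffset.
    static int restrain(long proposedOffset) {
        if (proposedOffset >= -MAX_OFFSET && proposedOffset <= MAX_OFFSET) {
            return (int) proposedOffset; // already within [-14:00, +14:00]
        }
        // Assumption: add or subtract whole days until the value lies in (-12:00, +12:00].
        long folded = Math.floorMod(proposedOffset, DAY); // 0 <= folded < DAY
        if (folded > DAY / 2) {
            folded -= DAY;
        }
        return (int) folded;
    }
}
```

Under these assumptions an offset of 25 hours folds to +01:00 and an offset of 15 hours folds to -09:00, while anything already inside the range is returned unchanged.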
-
formatTZoffset
public static java.lang.String formatTZoffset(long millis, java.lang.String sep)
Formats a time zone offset as #hh^mm where # is + or -, hh is hours, ^ is a separator, and mm is minutes. Any separator may be specified by the second argument; the usual values are ":" (ISO 8601), "" (RFC 822), and "'" (PDF). The returned value is constrained to the range -11:59 ... 11:59. For an offset of 0 millis, the String returned is "+00^00", never "Z". To get a "general" offset in the form GMT#hh:mm, write "GMT"+DateConverter.formatTZoffset(offset, ":");
Take care in choosing the source for the millis value. It can come from calendarValue.getTimeZone() or from calendarValue.get(Calendar.ZONE_OFFSET). If a TimeZone was created from a valid time zone ID, then it may have a daylight savings rule. (As of July 4, 2013, the database at http://www.iana.org/time-zones recognized 629 time zone regions.) But a TimeZone created as new SimpleTimeZone(millisOffset, "ID") will not have a daylight savings rule, not even if there is a known time zone with the given ID. To get the TimeZone named "xDT" with its DST rule, use an ID of EST5EDT, CST6CDT, MST7MDT, or PST8PDT.
When parsing PDF dates, the incoming value DOES NOT have a TIMEZONE value. At most it has an OFFSET value like -04'00'. It is generally impossible to determine what TIMEZONE corresponds to a given OFFSET. If the date is in the summer when daylight savings is in effect, an offset of -0400 might correspond to any one of the 38 regions (of 53) with standard time offset -0400 and no daylight saving. Or it might correspond to any one of the 31 regions (out of 43) that observe daylight savings and have standard time offset of -0500.
If a Calendar has not been assigned a TimeZone with setTimeZone(), it will have by default the local TIMEZONE, not just the OFFSET. In the USA, this TimeZone will have a daylight savings rule.
The offset assigned with calVal.set(Calendar.ZONE_OFFSET) differs from the offset in the TimeZone set by Calendar.setTimeZone(). Example: Suppose my local TimeZone is America/New_York. It has an offset of -05'00'. And suppose I set a GregorianCalendar's ZONE_OFFSET to -07'00':
    calVal = new GregorianCalendar(); // TimeZone is the local default
    calVal.set(Calendar.ZONE_OFFSET, -7* MILLIS_PER_HOUR);
Four different offsets can be computed from calVal:
    calVal.get(Calendar.ZONE_OFFSET) => -07:00
    calVal.get(Calendar.ZONE_OFFSET) + calVal.get(Calendar.DST_OFFSET) => -06:00
    calVal.getTimeZone().getRawOffset() => -05:00
    calVal.getTimeZone().getOffset(calVal.getTimeInMillis()) => -04:00
Which is correct? That is not clear, though setTimeZone() does seem to affect ZONE_OFFSET, and not vice versa. One cannot even test whether TimeZone or ZONE_OFFSET has been set; both have been set by initialization code. TimeZone is initialized to the local default time zone and ZONE_OFFSET is set from it. My choice in this DateConverter class has been to set the initial TimeZone of a GregorianCalendar to GMT. Thereafter the TimeZone is modified with adjustTimeZoneNicely(java.util.GregorianCalendar, java.util.TimeZone).
- Parameters:
millis
- a time zone offset expressed in milliseconds. Any value is accepted; it is normalized to [-11:59 ... +11:59]
sep
- a String to insert between hh and mm. May be empty.
- Returns:
- the formatted String for the offset
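To make the #hh^mm layout and the four candidate offsets concrete, here is a minimal stand-alone sketch that uses only java.util. The names OffsetFormatSketch, formatOffset, and candidateOffsets are hypothetical, the formatter does not normalize out-of-range input as the documented method does, and the calendar is pinned to America/New_York on a summer date so that the result does not depend on the machine's default zone.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

final class OffsetFormatSketch {
    private static final int MILLIS_PER_MINUTE = 60 * 1000;
    private static final int MILLIS_PER_HOUR = 60 * MILLIS_PER_MINUTE;

    // Hypothetical formatter: sign, two-digit hours, separator, two-digit minutes.
    // It does not fold out-of-range values back into a valid offset range.
    static String formatOffset(long millis, String sep) {
        char sign = millis < 0 ? '-' : '+';
        long abs = Math.abs(millis);
        long hours = abs / MILLIS_PER_HOUR;
        long minutes = (abs % MILLIS_PER_HOUR) / MILLIS_PER_MINUTE;
        return String.format(Locale.ROOT, "%c%02d%s%02d", sign, hours, sep, minutes);
    }

    // The four offsets discussed above, in the same order as in the text.
    static long[] candidateOffsets() {
        TimeZone newYork = TimeZone.getTimeZone("America/New_York");
        GregorianCalendar calVal = new GregorianCalendar(newYork);
        calVal.clear();
        calVal.set(2013, Calendar.JULY, 4, 12, 0, 0); // a date with DST in effect
        calVal.set(Calendar.ZONE_OFFSET, -7 * MILLIS_PER_HOUR);
        return new long[] {
            calVal.get(Calendar.ZONE_OFFSET),
            calVal.get(Calendar.ZONE_OFFSET) + calVal.get(Calendar.DST_OFFSET),
            calVal.getTimeZone().getRawOffset(),
            calVal.getTimeZone().getOffset(calVal.getTimeInMillis())
        };
    }

    public static void main(String[] args) {
        // Print each candidate with the ISO 8601 separator.
        for (long offset : candidateOffsets()) {
            System.out.println(formatOffset(offset, ":"));
        }
    }
}
```

The last two values come from the TimeZone itself, so they are the raw and DST-adjusted offsets for New York regardless of what was stored in ZONE_OFFSET. Passing "'" or "" as the separator gives the PDF and RFC 822 spellings of the same offset.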
-
parseTimeField
public static int parseTimeField(java.lang.String text, java.text.ParsePosition where, int maxlen, int remedy)
Parses an integer from a string, starting at and advancing a ParsePosition.
- Parameters:
text
- The string being parsed. If null, the remedy value is returned.
where
- The ParsePosition to start the search. This value will be incremented by the number of digits found, but no more than maxlen. That is, the ParsePosition will advance across at most maxlen initial digits in text. The error index is ignored and unchanged.
maxlen
- The maximum length of the integer to parse. Usually 2, but 4 for year fields. If the field of length maxlen begins with a digit, but contains a non-digit, no error is signaled and the integer value is returned.
remedy
- Value to be assigned if no digit is found at the initial parse position; that is, if the field is empty.
- Returns:
- The integer that was at the given parse position, or the remedy value if no digits were found.
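The contract above (consume at most maxlen leading digits, fall back to remedy when none are found) can be sketched as follows. This is an illustrative re-implementation under a hypothetical name, not the PDFBox code, and it assumes ASCII digits only.

```java
import java.text.ParsePosition;

final class TimeFieldSketch {
    // Hypothetical stand-in for parseTimeField.
    static int parseField(String text, ParsePosition where, int maxlen, int remedy) {
        if (text == null) {
            return remedy;
        }
        int start = where.getIndex();
        int limit = Math.min(text.length(), start + maxlen);
        int index = start;
        int value = 0;
        // Accumulate digits until a non-digit or the length limit is reached.
        while (index < limit && text.charAt(index) >= '0' && text.charAt(index) <= '9') {
            value = value * 10 + (text.charAt(index) - '0');
            index++;
        }
        if (index == start) {
            return remedy; // empty field: position is left untouched
        }
        where.setIndex(index);
        return value;
    }
}
```

Calling it repeatedly on "20130704" with maxlen 4 and then 2 yields 2013 and then 7, with the position advancing to 4 and then 6. A field such as "7:30" read with maxlen 2 stops at the colon without signaling an error.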
-
skipOptionals
public static char skipOptionals(java.lang.String text, java.text.ParsePosition where, java.lang.String optionals)
Advances the ParsePosition past any and all the characters that match those in the optionals list. In particular, a space will skip all spaces.
- Parameters:
text
- The text to examine
where
- index to start looking. The value is incremented by the number of optionals found. The error index is ignored and unchanged.
optionals
- A String listing all the optional characters to be skipped.
- Returns:
- The last non-space character passed over. Returns a space if no non-space character was found (even if space is not in the optionals list).
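A minimal sketch of this behavior, written against the description above rather than the PDFBox source (the class and method names are hypothetical):

```java
import java.text.ParsePosition;

final class SkipOptionalsSketch {
    // Hypothetical stand-in for skipOptionals.
    static char skip(String text, ParsePosition where, String optionals) {
        char last = ' '; // returned when no non-space character is passed over
        int index = where.getIndex();
        while (index < text.length() && optionals.indexOf(text.charAt(index)) >= 0) {
            char c = text.charAt(index);
            if (c != ' ') {
                last = c; // remember the most recent non-space optional
            }
            index++;
        }
        where.setIndex(index);
        return last;
    }
}
```

With the text "  - 12" and optionals " -", the position moves from 0 to 4 and the method returns '-', which lets a caller learn which delimiter was actually present.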
-
skipString
public static boolean skipString(java.lang.String text, java.lang.String victim, java.text.ParsePosition where)
If the victim string is at the given position in the text, this method advances the position past that string.
- Parameters:
text
- The text to examine
victim
- The string to look for
where
- The initial position to look at. After return, this will have been incremented by the length of the victim if it was found. The error index is ignored and unchanged.
- Returns:
- true if victim was found; otherwise false.
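The same idea can be expressed in a few lines with String.startsWith(String, int). This is only a sketch under a hypothetical name, not the library's implementation:

```java
import java.text.ParsePosition;

final class SkipStringSketch {
    // Hypothetical stand-in for skipString.
    static boolean skip(String text, String victim, ParsePosition where) {
        if (text.startsWith(victim, where.getIndex())) {
            // Found: move the position just past the matched string.
            where.setIndex(where.getIndex() + victim.length());
            return true;
        }
        return false; // not found: position is unchanged
    }
}
```

A typical use would be stepping over an optional "D:" prefix before parsing the digits of a PDF date.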
-
newGreg
public static java.util.GregorianCalendar newGreg()
Construct a new GregorianCalendar and set defaults. Locale is ENGLISH. TimeZone is "UTC" (zero offset and no DST). Parsing is NOT lenient. Milliseconds are zero.
- Returns:
- a new gregorian calendar
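The listed defaults can be reproduced with plain java.util calls. The sketch below is an assumption about how such a calendar might be built, with hypothetical names; it is not presented as the body of newGreg().

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.SimpleTimeZone;

final class DefaultCalendarSketch {
    // Hypothetical factory mirroring the documented defaults.
    static GregorianCalendar newUtcCalendar() {
        GregorianCalendar cal = new GregorianCalendar(Locale.ENGLISH);
        cal.setTimeZone(new SimpleTimeZone(0, "UTC")); // zero offset, no DST rule
        cal.setLenient(false);                         // out-of-range fields are rejected
        cal.set(Calendar.MILLISECOND, 0);
        return cal;
    }
}
```

Using a SimpleTimeZone with offset 0 guarantees there is no daylight savings rule attached, which matches the "zero offset and no DST" wording above.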
-
adjustTimeZoneNicely
public static void adjustTimeZoneNicely(java.util.GregorianCalendar cal, java.util.TimeZone tz)
Install a TimeZone on a GregorianCalendar without changing the hours value. A plain GregorianCalendar.setTimeZone() adjusts the Calendar.HOUR value to compensate. This is *BAD* (not to say *EVIL*) when we have already set the time.
- Parameters:
cal
- The GregorianCalendar whose TimeZone to change.
tz
- The new TimeZone.
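One way to keep the wall-clock fields while switching zones is to read them, install the new zone, and write them back. The sketch below illustrates that approach with hypothetical names; it is not claimed to be how adjustTimeZoneNicely is implemented, and it ignores the ERA field (it assumes AD dates).

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class WallClockZoneSketch {
    // Hypothetical helper: change the zone but keep year..millisecond as displayed.
    static void installZoneKeepingWallClock(GregorianCalendar cal, TimeZone tz) {
        // Reading the fields first forces them to be computed in the old zone.
        int year = cal.get(Calendar.YEAR);
        int month = cal.get(Calendar.MONTH);
        int day = cal.get(Calendar.DAY_OF_MONTH);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        int minute = cal.get(Calendar.MINUTE);
        int second = cal.get(Calendar.SECOND);
        int millis = cal.get(Calendar.MILLISECOND);
        cal.setTimeZone(tz);
        // Writing the fields back makes the calendar recompute its instant in the new zone.
        cal.set(year, month, day, hour, minute, second);
        cal.set(Calendar.MILLISECOND, millis);
    }

    // Example input: 2013-07-04 12:00:00 UTC.
    static GregorianCalendar noonUtc() {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(2013, Calendar.JULY, 4, 12, 0, 0);
        return cal;
    }
}
```

Applied to noonUtc() with America/New_York, the hour stays at 12 while the underlying instant moves four hours later, since noon in New York on that date is 16:00 UTC.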
-
parseTZoffset
public static boolean parseTZoffset(java.lang.String text, java.util.GregorianCalendar cal, java.text.ParsePosition initialWhere)
Parses the end of a date string for a time zone and, if one is found, sets the time zone of the GregorianCalendar. Otherwise the calendar time zone is unchanged. The text is parsed as
    (Z|GMT|UTC)? [+- ]* h [': ]? m '?
where the leading String is optional, h is two digits by default, but may be a single digit if followed by one of space, apostrophe, colon, or the end of string. Similarly, m is one or two digits. This scheme accepts the formats of PDF, RFC 822, and ISO 8601. If none of these applies (as for a time zone name), we try TimeZone.getTimeZone().
- Parameters:
text
- The text expected to begin with a time zone value, possibly with leading or trailing spaces.
cal
- The Calendar whose TimeZone to set.
initialWhere
- Scanning begins at the index of this ParsePosition. After success, the index is that of the next character after the recognized string. The error index is ignored and unchanged.
- Returns:
- true if a time zone value was parsed; otherwise the time zone is unchanged and the return value is false.
-
parseBigEndianDate
public static java.util.GregorianCalendar parseBigEndianDate(java.lang.String text, java.text.ParsePosition initialWhere)
Parses a big-endian date: year month day hour min sec. The year must be four digits. Other fields may be adjacent and delimited by length, or they may follow appropriate delimiters:
    year [ -/]* month [ -/]* dayofmonth [ T]* hour [:] min [:] sec [.secFraction]
If any numeric field is omitted, all following fields must also be omitted. No time zone is processed. Ambiguous dates can produce unexpected results. For example, 1970 12 23:08 will parse as 1970 December 23 00:08:00.
- Parameters:
text
- The string to parse.
initialWhere
- Where to begin the parse. On return the index is advanced to just beyond the last character processed. The error index is ignored and unchanged.
- Returns:
- a GregorianCalendar representing the parsed date, or null if the text did not begin with at least four digits.
-
parseSimpleDate
public static java.util.GregorianCalendar parseSimpleDate(java.lang.String text, java.lang.String[] fmts, java.text.ParsePosition initialWhere)
See if text can be parsed as a date according to any of a list of formats. The time zone may be included as part of the format, or omitted in favor of later testing for a trailing time zone.
- Parameters:
text
- The text to be parsed.
fmts
- A list of formats to be tried. The syntax is that for SimpleDateFormat.
initialWhere
- At start this is the position to begin examining the text. Upon return it will have been incremented to refer to the next non-space character after the date. If no date was found, the value is unchanged. The error index is ignored and unchanged.
- Returns:
- null for failure to find a date, or the GregorianCalendar for the date that was found. Unless a time zone was part of the format, the time zone will be GMT+0.
-
parseDate
public static java.util.Calendar parseDate(java.lang.String text, java.lang.String[] moreFmts, java.text.ParsePosition initialWhere)
Parses a String to see if it begins with a date, and if so, returns that date. The date must be strictly correct: no field may exceed the appropriate limit. (That is, the Calendar has setLenient(false).) Skips initial spaces, but does NOT check for "D:". The scan first tries parseBigEndianDate and parseTZoffset and then tries parseSimpleDate with appropriate formats, again followed by parseTZoffset. If at any stage the entire text is consumed, that date value is returned immediately. Otherwise the date that consumes the longest initial part of the text is returned.
- PDF format dates are among those recognized by parseBigEndianDate.
- The formats tried are alphaStartFormats or digitStartFormat and any listed in the value of moreFmts.
- Parameters:
text
- The String that may begin with a date. Must not be null. Initial spaces and "D:" are skipped over.
moreFmts
- Additional formats to be tried after trying the built-in formats.
initialWhere
- Parsing begins at the given position in text. If the parse succeeds, the index is advanced to point to the first unrecognized character. The error index is ignored and unchanged.
- Returns:
- A GregorianCalendar for the date. If no date is found, returns null. The time zone will be GMT+0 unless parsing succeeded with a format containing a time zone. (Only one builtin format contains a time zone.)
-
toCalendar
public static java.util.Calendar toCalendar(COSString text) throws java.io.IOException
Deprecated. This method throws an IOException for failure. Replace calls to it with toCalendar(String, String[]) and test for failure with (value == null || value.get(Calendar.YEAR) == INVALID_YEAR).
Converts a string to a Calendar by parsing the String for a date.
- Parameters:
text
- The COSString representation of a date.
- Returns:
- The Calendar that the text string represents, or null if text was null.
- Throws:
java.io.IOException
- If the date string is not in the correct format.
-
toCalendar
public static java.util.Calendar toCalendar(java.lang.String text) throws java.io.IOException
Deprecated. This method throws an IOException for failure. Replace calls to it with toCalendar(String, String[]) using null for the second parameter and test for failure with (value == null || value.get(Calendar.YEAR) == INVALID_YEAR).
Converts a string date to a Calendar date value; equivalent to toCalendar(String, String[]) using null for the second parameter, but throws an IOException for failure. The returned value will have 0 for DST_OFFSET.
- Parameters:
text
- The string representation of the calendar.
- Returns:
- The Calendar that this string represents, or null if the incoming text is null.
- Throws:
java.io.IOException
- If the date string is non-null and not a parseable date.
-
toCalendar
public static java.util.Calendar toCalendar(java.lang.String text, java.lang.String[] moreFmts)
Converts a string to a calendar. The entire string must be consumed. The date must be strictly correct; that is, no field may exceed the appropriate limit. Uses parseDate(java.lang.String, java.lang.String[], java.text.ParsePosition) to do the actual parsing. The returned value will have 0 for DST_OFFSET.
- Parameters:
text
- The text to parse. Initial spaces and "D:" are skipped over.
moreFmts
- An Array of formats (as Strings) to try in addition to the standard list.
- Returns:
- the Calendar value corresponding to the date text. If text does not represent a valid date, the value is January 1 in year INVALID_YEAR at 0:0:0 GMT.
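Because failure is reported through a sentinel year rather than an exception, callers need an explicit check on the result. The helper below is a hypothetical sketch of the test recommended in the deprecation notes; its invalidYear parameter stands in for DateConverter.INVALID_YEAR so that the snippet compiles without PDFBox on the classpath.

```java
import java.util.Calendar;

final class ParseResultCheck {
    // Hypothetical helper: true when a parse result should be treated as a failure.
    // Pass DateConverter.INVALID_YEAR as invalidYear in real code.
    static boolean isInvalid(Calendar value, int invalidYear) {
        return value == null || value.get(Calendar.YEAR) == invalidYear;
    }
}
```

With PDFBox available, a call might look like isInvalid(DateConverter.toCalendar(text, null), DateConverter.INVALID_YEAR), which covers both a null result and the sentinel date.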
-
-