Parsing any date in Java

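I know this question gets asked a lot, and obviously you can't parse just any arbitrary date. However, I've found that the python-dateutil library can parse every date I throw at it, without my having to work out a date format string at all. Joda Time is always sold as a great Java date parser, but it still requires you to decide what format your date is in before you pick a format (or create your own). You can't just call DateFormatter.parse(mydate) and magically get a Date object back.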

For example, the date "Wed Mar 04 05:09:06 GMT-06:00 2009" is parsed correctly with python-dateutil:

import dateutil.parser
print(dateutil.parser.parse('Wed Mar 04 05:09:06 GMT-06:00 2009'))

But the following Joda Time call doesn't work:

String date = "Wed Mar 04 05:09:06 GMT-06:00 2009";
DateTimeFormatter fmt = ISODateTimeFormat.dateTime();
DateTime dt = fmt.parseDateTime(date);
System.out.println(date);

And creating your own DateTimeFormatter defeats the purpose, since that seems to be the same as using SimpleDateFormat with the correct format string.

Is there a comparable way to parse a date in Java, like python-dateutil? I don't care about errors, I just want it to be mostly perfect.


What I have seen done is a Date util class that contains several typical date formats. So, when DateUtil.parse(date) is called, it tries to parse the date with each date format internally and only throws exceptions if none of the internal formats can parse it.

It is basically a brute force approach to your problem.
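A minimal sketch of that brute-force idea, assuming SimpleDateFormat and a short, hypothetical list of candidate patterns (the class name, the pattern list, and the choice of IllegalArgumentException are illustrative, not taken from any particular library):

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: tries each known pattern in turn.
final class BruteForceDateParser {

    // Assumed candidate patterns; order matters when two patterns could both match.
    private static final List<String> PATTERNS = Arrays.asList(
            "yyyy-MM-dd HH:mm:ss",
            "yyyy-MM-dd",
            "dd-MM-yyyy",
            "MM/dd/yyyy",
            "dd MMM yyyy");

    private BruteForceDateParser() {
    }

    public static Date parse(String text) {
        for (String pattern : PATTERNS) {
            // SimpleDateFormat is not thread-safe, so create one per attempt.
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
            format.setLenient(false); // reject impossible values such as month 13
            ParsePosition position = new ParsePosition(0);
            Date result = format.parse(text, position);
            // Accept only if the pattern consumed the whole input.
            if (result != null && position.getIndex() == text.length()) {
                return result;
            }
        }
        // Fail only after every pattern has been tried.
        throw new IllegalArgumentException("Unparseable date: \"" + text + "\"");
    }

    public static void main(String[] args) {
        System.out.println(parse("2009-03-04 05:09:06"));
        System.out.println(parse("04 Mar 2009"));
    }
}
```

Note that ambiguous inputs such as 03/04/2009 resolve to whichever matching pattern comes first in the list, so the order has to reflect the formats you actually expect.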

Your best bet is really to enlist regex to match the date format pattern, and/or to brute force it.

Several years ago I wrote a silly little DateUtil class which did the job. Here's the relevant extract:

private static final Map<String, String> DATE_FORMAT_REGEXPS = new HashMap<String, String>() {{
put("^\\d{8}$", "yyyyMMdd");
put("^\\d{1,2}-\\d{1,2}-\\d{4}$", "dd-MM-yyyy");
put("^\\d{4}-\\d{1,2}-\\d{1,2}$", "yyyy-MM-dd");
put("^\\d{1,2}/\\d{1,2}/\\d{4}$", "MM/dd/yyyy");
put("^\\d{4}/\\d{1,2}/\\d{1,2}$", "yyyy/MM/dd");
put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}$", "dd MMM yyyy");
put("^\\d{1,2}\\s[a-z]{4,}\\s\\d{4}$", "dd MMMM yyyy");
put("^\\d{12}$", "yyyyMMddHHmm");
put("^\\d{8}\\s\\d{4}$", "yyyyMMdd HHmm");
put("^\\d{1,2}-\\d{1,2}-\\d{4}\\s\\d{1,2}:\\d{2}$", "dd-MM-yyyy HH:mm");
put("^\\d{4}-\\d{1,2}-\\d{1,2}\\s\\d{1,2}:\\d{2}$", "yyyy-MM-dd HH:mm");
put("^\\d{1,2}/\\d{1,2}/\\d{4}\\s\\d{1,2}:\\d{2}$", "MM/dd/yyyy HH:mm");
put("^\\d{4}/\\d{1,2}/\\d{1,2}\\s\\d{1,2}:\\d{2}$", "yyyy/MM/dd HH:mm");
put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}\\s\\d{1,2}:\\d{2}$", "dd MMM yyyy HH:mm");
put("^\\d{1,2}\\s[a-z]{4,}\\s\\d{4}\\s\\d{1,2}:\\d{2}$", "dd MMMM yyyy HH:mm");
put("^\\d{14}$", "yyyyMMddHHmmss");
put("^\\d{8}\\s\\d{6}$", "yyyyMMdd HHmmss");
put("^\\d{1,2}-\\d{1,2}-\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "dd-MM-yyyy HH:mm:ss");
put("^\\d{4}-\\d{1,2}-\\d{1,2}\\s\\d{1,2}:\\d{2}:\\d{2}$", "yyyy-MM-dd HH:mm:ss");
put("^\\d{1,2}/\\d{1,2}/\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "MM/dd/yyyy HH:mm:ss");
put("^\\d{4}/\\d{1,2}/\\d{1,2}\\s\\d{1,2}:\\d{2}:\\d{2}$", "yyyy/MM/dd HH:mm:ss");
put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "dd MMM yyyy HH:mm:ss");
put("^\\d{1,2}\\s[a-z]{4,}\\s\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "dd MMMM yyyy HH:mm:ss");
}};


/**
* Determine SimpleDateFormat pattern matching with the given date string. Returns null if
* format is unknown. You can simply extend DateUtil with more formats if needed.
* @param dateString The date string to determine the SimpleDateFormat pattern for.
* @return The matching SimpleDateFormat pattern, or null if format is unknown.
* @see SimpleDateFormat
*/
public static String determineDateFormat(String dateString) {
for (String regexp : DATE_FORMAT_REGEXPS.keySet()) {
if (dateString.toLowerCase().matches(regexp)) {
return DATE_FORMAT_REGEXPS.get(regexp);
}
}
return null; // Unknown format.
}

(cough, double brace initialization, cough, it was just to get it all to fit in 100 char max length ;) )

You can easily expand it yourself with new regex and dateformat patterns.
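As a rough sketch of how the detected pattern might then be fed to SimpleDateFormat: the class below repeats only three entries of the table above, and both the `parse` helper and the use of a LinkedHashMap (to make the check order predictable) are my own assumptions rather than part of the original DateUtil.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical wrapper: detect the pattern by regex, then parse with it.
final class DetectAndParse {

    // A small subset of the regex table; LinkedHashMap keeps insertion order.
    private static final Map<String, String> DATE_FORMAT_REGEXPS = new LinkedHashMap<>();
    static {
        DATE_FORMAT_REGEXPS.put("^\\d{8}$", "yyyyMMdd");
        DATE_FORMAT_REGEXPS.put("^\\d{4}-\\d{1,2}-\\d{1,2}$", "yyyy-MM-dd");
        DATE_FORMAT_REGEXPS.put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}$", "dd MMM yyyy");
    }

    private DetectAndParse() {
    }

    // Returns the matching SimpleDateFormat pattern, or null if the format is unknown.
    public static String determineDateFormat(String dateString) {
        String lower = dateString.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : DATE_FORMAT_REGEXPS.entrySet()) {
            if (lower.matches(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Assumed helper: detects the pattern first, then parses strictly with it.
    public static Date parse(String dateString) {
        String pattern = determineDateFormat(dateString);
        if (pattern == null) {
            throw new IllegalArgumentException("Unknown date format: " + dateString);
        }
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setLenient(false); // a regex match does not guarantee a valid date
        try {
            return format.parse(dateString);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid date: " + dateString, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(determineDateFormat("2009-03-04"));
        System.out.println(parse("04 Mar 2009"));
    }
}
```

The regex only checks the shape of the input, so a string like 2009-13-40 still matches a pattern; the strict (non-lenient) parse is what rejects it.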

There is a nice library called Natty which I think fits your purposes:

Natty is a natural language date parser written in Java. Given a date expression, natty will apply standard language recognition and translation techniques to produce a list of corresponding dates with optional parse and syntax information.

You can also try it online!

You could try dateparser.

It recognizes the format of a String automatically and parses it into a Date, Calendar, LocalDateTime, or OffsetDateTime correctly and quickly (1us~1.5us).

It isn't based on any natural language analyzer, SimpleDateFormat, or regex.Pattern.

With it, you don't have to prepare any appropriate patterns like yyyy-MM-dd'T'HH:mm:ss.SSSZ or yyyy-MM-dd'T'HH:mm:ss.SSSZZ:

Date date = DateParserUtils.parseDate("2015-04-29T10:15:00.500+0000");
Calendar calendar = DateParserUtils.parseCalendar("2015-04-29T10:15:00.500Z");
LocalDateTime dateTime = DateParserUtils.parseDateTime("2015-04-29 10:15:00.500 +00:00");

All works fine, please enjoy it.

I have no idea how to do this parsing in Python. In Java we can do it like this:

SimpleDateFormat sdf1 = new SimpleDateFormat("dd-MM-yyyy");
java.util.Date normalDate = null;
java.sql.Date sqlDate = null;
normalDate = sdf1.parse(date);
sqlDate = new java.sql.Date(normalDate.getTime());
System.out.println(sqlDate);

I think Python will have some predefined functions for this, like Java does. You can follow this approach, which parses a date String (dd-MM-yyyy) into a java.sql.Date:

import java.text.SimpleDateFormat;
import java.text.ParseException;
public class HelloWorld{
public static void main(String []args){
String date ="26-12-2019";
SimpleDateFormat sdf1 = new SimpleDateFormat("dd-MM-yyyy");
java.util.Date normalDate = null;
java.sql.Date sqlDate = null;
if( !date.isEmpty()) {
try {
normalDate = sdf1.parse(date);
sqlDate = new java.sql.Date(normalDate.getTime());
System.out.println(sqlDate);
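} catch (ParseException e) {
// Report the failure instead of silently swallowing it.
System.err.println("Could not parse date: " + e.getMessage());
}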
}
}
}

execute this!

// Requires the PrettyTime NLP library: org.ocpsoft.prettytime.nlp.PrettyTimeParser
String str = "2020.03.03";
Date date = new PrettyTimeParser().parseSyntax(str).get(0).getDates().get(0);
System.out.println(date);