In a nutshell, a window is created as soon as the first element that should belong to this window arrives, and the window is completely removed when the time (event or processing time) passes its end timestamp plus the user-specified allowed lateness (see Allowed Lateness). Flink guarantees removal only for time-based windows and not for other types, e.g. global windows (see Window Assigners). For example, with an event-time-based windowing strategy that creates non-overlapping (or tumbling) windows every 5 minutes and has an allowed lateness of 1 min, Flink will create a new window for the interval between 12:00 and 12:05 when the first element with a timestamp that falls into this interval arrives, and it will remove it when the watermark passes the 12:06 timestamp.
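The arithmetic behind the 12:00 / 12:05 / 12:06 example can be sketched in plain Java without any Flink dependency. This is a simplified model that follows the wording of the paragraph above (removal at window end plus allowed lateness). The class and method names are made up for illustration and are not Flink APIs, and timestamps are assumed to be non-negative milliseconds since midnight.

```java
import java.time.Duration;

// Hypothetical helper, not part of Flink: models tumbling-window bounds only.
class WindowLifecycleSketch {

    /** Start of the tumbling window that contains the given timestamp. */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    /** Exclusive end of that window. */
    static long windowEnd(long timestampMs, long sizeMs) {
        return windowStart(timestampMs, sizeMs) + sizeMs;
    }

    /** Time after which the window is removed: end plus allowed lateness. */
    static long removalTime(long timestampMs, long sizeMs, long allowedLatenessMs) {
        return windowEnd(timestampMs, sizeMs) + allowedLatenessMs;
    }

    /** Formats milliseconds since midnight as HH:mm. */
    static String hhmm(long ms) {
        return String.format("%02d:%02d", ms / 3_600_000, (ms / 60_000) % 60);
    }

    public static void main(String[] args) {
        long size = Duration.ofMinutes(5).toMillis();
        long lateness = Duration.ofMinutes(1).toMillis();
        // First element of the window, stamped 12:03.
        long ts = Duration.ofHours(12).plusMinutes(3).toMillis();

        // prints: window [12:00, 12:05) is removed once the watermark passes 12:06
        System.out.println("window [" + hhmm(windowStart(ts, size)) + ", "
                + hhmm(windowEnd(ts, size)) + ") is removed once the watermark passes "
                + hhmm(removalTime(ts, size, lateness)));
    }
}
```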
In addition, each window will have a Trigger (see Triggers) and a function (ProcessWindowFunction, ReduceFunction, AggregateFunction or FoldFunction) (see Window Functions) attached to it. The function will contain the computation to be applied to the contents of the window, while the Trigger specifies the conditions under which the window is considered ready for the function to be applied. A triggering policy might be something like “when the number of elements in the window is more than 4”, or “when the watermark passes the end of the window”. A trigger can also decide to purge a window’s contents any time between its creation and removal. Purging in this case only refers to the elements in the window, and not the window metadata. This means that new data can still be added to that window.
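To make the difference between firing and purging concrete, here is a small plain-Java simulation of the "more than 4 elements" policy. It is only a sketch: the class is hypothetical, the window function is assumed to be a simple sum, and the enum merely borrows its names from Flink's `TriggerResult` rather than using the real trigger interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one window with a count-based trigger; not Flink code.
class CountTriggerSketch {
    enum Result { CONTINUE, FIRE, FIRE_AND_PURGE }

    private final List<Integer> contents = new ArrayList<>();
    private final int threshold;
    private final boolean purgeOnFire;
    /** Results produced by the (assumed) window function, one per firing. */
    final List<Integer> emitted = new ArrayList<>();

    CountTriggerSketch(int threshold, boolean purgeOnFire) {
        this.threshold = threshold;
        this.purgeOnFire = purgeOnFire;
    }

    /** Adds an element and fires when the window holds more than `threshold` elements. */
    Result add(int element) {
        contents.add(element);
        if (contents.size() <= threshold) {
            return Result.CONTINUE;
        }
        int sum = 0;
        for (int v : contents) {
            sum += v; // the window function: sum of the current contents
        }
        emitted.add(sum);
        if (purgeOnFire) {
            contents.clear(); // purge drops the elements, the window itself stays usable
            return Result.FIRE_AND_PURGE;
        }
        return Result.FIRE;
    }

    int size() {
        return contents.size();
    }

    public static void main(String[] args) {
        CountTriggerSketch purging = new CountTriggerSketch(4, true);
        CountTriggerSketch keeping = new CountTriggerSketch(4, false);
        for (int i = 1; i <= 6; i++) {
            purging.add(i);
            keeping.add(i);
        }
        System.out.println(purging.emitted + " " + purging.size()); // prints: [15] 1
        System.out.println(keeping.emitted + " " + keeping.size()); // prints: [15, 21] 6
    }
}
```

In the purging variant the sixth element is still accepted after the purge, which matches the statement that new data can be added to a purged window.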
Apart from the above, you can specify an Evictor (see Evictors) which will be able to remove elements from the window after the trigger fires and before and/or after the function is applied. [https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/windows.html#window-lifecycle]
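As a rough illustration of eviction before the function is applied, the sketch below keeps only the most recent elements of a window buffer and then applies a sum. It is plain Java and only loosely modelled on a count-based evictor. The class, the method names and the choice of a sum as window function are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical count-based eviction on a plain list; not the Flink Evictor interface.
class CountEvictorSketch {

    /** Drops elements from the front of the buffer until at most maxCount remain. */
    static <T> List<T> evictBefore(List<T> buffer, int maxCount) {
        int from = Math.max(0, buffer.size() - maxCount);
        return new ArrayList<>(buffer.subList(from, buffer.size()));
    }

    /** The (assumed) window function: sum of the remaining elements. */
    static int sum(List<Integer> elements) {
        int total = 0;
        for (int v : elements) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> buffer = Arrays.asList(1, 2, 3, 4, 5);
        // The trigger has fired; evict first, then apply the function.
        List<Integer> kept = evictBefore(buffer, 3);
        System.out.println(kept + " -> " + sum(kept)); // prints: [3, 4, 5] -> 12
    }
}
```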
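3. The approach recommended by the official documentation is "Getting late data as a side output", which makes it possible to obtain the stream of re-activated windows separately: https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/windows.html#getting-late-data-as-a-side-output
It is currently unclear whether the original stream also contains the data of the re-activated windows. This still needs testing; judging from the code, it should be included.
Confirmed: windows in the original stream are also re-activated once.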
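The following plain-Java sketch models where an element for a given window ends up depending on the current watermark, as far as I understand the behaviour with the default event-time trigger. It is a simplification under stated assumptions: the names are hypothetical, times are minutes since midnight, and the boundaries follow the "end plus allowed lateness" wording quoted earlier, so the exact comparison in the real operator may differ by a millisecond.

```java
// Hypothetical routing model for one event-time window; not Flink code.
class LateElementRoutingSketch {
    enum Route {
        BUFFERED,     // window has not fired yet, element waits for the regular firing
        LATE_FIRING,  // window already fired but still exists, so it fires again in the main stream
        SIDE_OUTPUT   // window already removed, element goes to the late-data side output
    }

    /** Decides the route for an element that belongs to the window ending at windowEnd. */
    static Route route(long windowEnd, long allowedLateness, long watermark) {
        if (watermark >= windowEnd + allowedLateness) {
            return Route.SIDE_OUTPUT;
        }
        if (watermark >= windowEnd) {
            return Route.LATE_FIRING;
        }
        return Route.BUFFERED;
    }

    public static void main(String[] args) {
        long end = 12 * 60 + 5; // window [12:00, 12:05), in minutes since midnight
        long lateness = 1;      // allowed lateness of 1 minute
        System.out.println(route(end, lateness, 12 * 60 + 4)); // prints: BUFFERED
        System.out.println(route(end, lateness, 12 * 60 + 5)); // prints: LATE_FIRING
        System.out.println(route(end, lateness, 12 * 60 + 6)); // prints: SIDE_OUTPUT
    }
}
```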
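// T stands for the element type of the stream; the trailing {} creates an anonymous subclass so the type information is kept.
final OutputTag<T> lateOutputTag = new OutputTag<T>("late-data"){};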