Standard Tiling Jobs

The following sections describe advanced properties available for standard tiling jobs executed with the CSVBinner.

Source Data Format

The oculus.binning.parsing.<field>.fieldType property indicates the type of values stored in a column that the CSVBinner will parse. Possible types include:

Value Description
double Real, double-precision floating-point numbers. This is the default type.
constant or zero Constant value of 0.0. The column does not need to exist.
int Integers
long Long (64-bit) integers
date Dates parsed and transformed into milliseconds since the Java epoch (January 1, 1970, 00:00:00 GMT) using SimpleDateFormat. Expects the format yyMMddHHmm by default. Override the default format using the oculus.binning.parsing.<field>.dateFormat property.
boolean Boolean values (e.g., true/false, yes/no)
byte Bytes
short Short integers
float Floating-point numbers
ipv4 IPv4 addresses, each treated as a four-digit base-256 number and converted into an array of four bytes
string String value
propertyMap Property maps. Requires the presence of an additional set of propertyMap fields.
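
As an illustration of the ipv4 type described above, the following Python sketch splits a dotted-quad address into four byte values and shows the equivalent base-256 number. It is not the CSVBinner implementation; the function names ipv4_to_bytes and bytes_to_number are hypothetical and exist only for this example.

```python
def ipv4_to_bytes(address):
    """Split a dotted-quad IPv4 address into a list of four byte values."""
    parts = address.strip().split(".")
    if len(parts) != 4:
        raise ValueError("expected four octets, got %r" % address)
    octets = [int(part) for part in parts]
    if any(octet < 0 or octet > 255 for octet in octets):
        raise ValueError("octet out of range in %r" % address)
    return octets


def bytes_to_number(octets):
    """Read the four bytes as the digits of a base-256 number."""
    value = 0
    for octet in octets:
        value = value * 256 + octet
    return value


octets = ipv4_to_bytes("192.168.0.1")
print(octets)                   # [192, 168, 0, 1]
print(bytes_to_number(octets))  # 3232235521
```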

dateFormat

The following property overrides the default date format (yyMMddHHmm) expected when you specify a field type to be date.

Property Description
oculus.binning.parsing.<field>.dateFormat Specifies the date format expected by the corresponding field (e.g., yyyy-MM-dd HH:mm:ss).
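
The conversion from a date string to epoch milliseconds can be sketched in Python as follows. This is only an approximation of the behavior described above: the strptime patterns are hand-translated equivalents of the Java patterns yyMMddHHmm and yyyy-MM-dd HH:mm:ss, the function name date_to_epoch_millis is hypothetical, and the sketch assumes UTC, whereas the time zone the CSVBinner applies is not stated here.

```python
from datetime import datetime, timedelta, timezone

# strptime equivalent of the default Java pattern yyMMddHHmm (assumed mapping)
DEFAULT_FORMAT = "%y%m%d%H%M"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_epoch_millis(text, fmt=DEFAULT_FORMAT):
    """Parse text with fmt and return milliseconds since the epoch (UTC assumed)."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


# Default format: 2015-01-01 12:30
print(date_to_epoch_millis("1501011230"))
# Overridden format, equivalent to dateFormat=yyyy-MM-dd HH:mm:ss
print(date_to_epoch_millis("2015-01-01 12:30:00", "%Y-%m-%d %H:%M:%S"))
```

Both calls describe the same instant, so they return the same number of milliseconds.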

propertyMap

The following properties are required when you specify a field type to be propertyMap:

Property Description
oculus.binning.parsing.<field>.property Name of the property
oculus.binning.parsing.<field>.propertyType Type of the property value. Accepts the same values as fieldType.
oculus.binning.parsing.<field>.propertySeparator Character or string used to separate properties
oculus.binning.parsing.<field>.propertyValueSeparator Character or string used to separate property keys from their values

For example, consider the following propertyMap setup:

oculus.binning.parsing.<field>.property=name
oculus.binning.parsing.<field>.propertyType=string
oculus.binning.parsing.<field>.propertySeparator=;
oculus.binning.parsing.<field>.propertyValueSeparator=,

In this case, a field value of id,123;name,foo;description,bar would yield the value foo.
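
The lookup that a propertyMap field implies can be sketched in Python as shown below. This is a minimal illustration rather than the CSVBinner's own parser: the function name extract_property, the sample field value, and the choice to return None for a missing property are all assumptions made for this example.

```python
def extract_property(field_value, name, property_separator, value_separator,
                     convert=float):
    """Return the converted value of the named property, or None if absent."""
    # Split the field into key/value entries, then split each entry once.
    for entry in field_value.split(property_separator):
        key, found, value = entry.partition(value_separator)
        if found and key.strip() == name:
            return convert(value.strip())
    return None  # Assumed behavior for a missing property.


field_value = "id,123;weight,4.5;height,10"
# Properties separated by ";", keys separated from values by ","
print(extract_property(field_value, "weight", ";", ","))           # 4.5
print(extract_property(field_value, "id", ";", ",", convert=int))  # 123
```

The convert argument plays the role of propertyType: it determines how the extracted text is turned into a value.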

Field Scaling

The following properties enable you to modify the values of a field before they are used for binning:

Property Description
oculus.binning.parsing.<field>.fieldScaling How field values should be scaled. The default leaves values as they are. Other possibilities are:
  • log: take the log of the value. The base of the logarithm is taken from oculus.binning.parsing.<field>.fieldBase.
oculus.binning.parsing.<field>.fieldBase Base of the logarithm used to scale field values.
oculus.binning.parsing.<field>.fieldAggregation Method of aggregation used on values of field. Describes how values from multiple data points in the same bin should be aggregated together to create a single value for the bin.

The default is addition. Other possible aggregation types are:

  • min: find the minimum value
  • max: find the maximum value
  • log: treat the number as a logarithmic value; aggregation of a and b is log_base(base^a+base^b). Base is taken from property oculus.binning.parsing.<field>.fieldBase, and defaults to e.
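
To make the log options concrete, the Python sketch below implements log scaling and the log aggregation formula log_base(base^a+base^b) given above. The function names are hypothetical, and the sketch assumes that the base defaults to e for scaling as well as for aggregation.

```python
import math


def scale_log(value, base=math.e):
    """fieldScaling=log: replace a value with its logarithm in the given base."""
    return math.log(value, base)


def aggregate_log(a, b, base=math.e):
    """fieldAggregation=log: combine two logarithmic values a and b.

    Computes log_base(base^a + base^b), which is the logarithm of the sum
    of the two underlying (unscaled) values.
    """
    return math.log(base ** a + base ** b, base)


# Two raw values, 100 and 1000, scaled with base 10
a = scale_log(100, 10)
b = scale_log(1000, 10)

# Aggregating the scaled values matches scaling the raw sum (1100),
# up to floating-point rounding.
print(aggregate_log(a, b, 10))
print(scale_log(1100, 10))
```

Under these assumptions, log aggregation behaves like addition applied to the original values, so it pairs naturally with a field that has already been log-scaled.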

Consolidation Partitions

The oculus.binning.consolidationPartitions property controls the number of partitions into which data is consolidated when binning.

Property Description
oculus.binning.consolidationPartitions The number of partitions into which to consolidate data when binning. If not included, Spark automatically selects the number of partitions.