Aleksey Zinoviev created IGNITE-12079:
-----------------------------------------
Summary: [ML][Umbrella] Add advanced preprocessing techniques
Key: IGNITE-12079
URL: https://issues.apache.org/jira/browse/IGNITE-12079
Project: Ignite
Issue Type: New Feature
Components: ml
Affects Versions: 2.8
Reporter: Aleksey Zinoviev
Assignee: Aleksey Zinoviev
Fix For: 2.8
*Main goal:*
To reduce the gap between Apache Spark and Apache Ignite in preprocessing operations. Closing this gap could help with loading Spark ML Pipelines into Ignite ML.
Next steps:
# Add Frequency Encoder
# Add Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
# Add RobustScaler (will be added in Spark 3.0)
# Add CountVectorizer
# Add FeatureHasher
# Add QuantileDiscretizer
# Add Locality Sensitive Hashing (LSH)
# Add LabelEncoder
# Add RevertStringIndexing
# Add multi-column preprocessor
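To make the first two items above more concrete, here is a minimal sketch of frequency encoding and of the listed imputing strategies. It is plain Python and only illustrates the idea; it is not the Ignite ML API. All function names are hypothetical, and the meaning of COUNT (fill with the number of non-missing values) is an assumption, as is using None to mark a missing value.

```python
from collections import Counter


def fit_frequency_encoder(values):
    """Learn a mapping from each category to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def transform_frequency(values, encoding, default=0.0):
    """Replace each category with its learned frequency.

    Categories not seen during fitting get `default` (an assumption).
    """
    return [encoding.get(value, default) for value in values]


def fit_imputer(values, strategy):
    """Compute the fill value for missing entries (None) of one column."""
    present = [value for value in values if value is not None]
    if strategy == "MIN":
        return min(present)
    if strategy == "MAX":
        return max(present)
    if strategy == "COUNT":
        # Assumption: COUNT means the number of non-missing values.
        return len(present)
    counts = Counter(present)
    if strategy == "MOST_FREQUENT":
        return counts.most_common(1)[0][0]
    if strategy == "LEAST_FREQUENT":
        return min(counts, key=counts.get)
    raise ValueError("Unknown imputing strategy: " + strategy)


def impute(values, fill_value):
    """Replace missing entries (None) with the fitted fill value."""
    return [fill_value if value is None else value for value in values]


# Usage: encode a categorical column, then impute a numeric one.
encoding = fit_frequency_encoder(["red", "red", "blue", "green"])
encoded = transform_frequency(["red", "blue", "black"], encoding)

column = [3, None, 1, 3, 7, 7, 7]
filled = impute(column, fit_imputer(column, "MOST_FREQUENT"))
```

In a distributed setting the fit step would be computed per partition and then merged, which works for all of the statistics above because counts, minima, and maxima combine associatively.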
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)