Brian Benedict | LinkedIn
The newer-generation (and, in some cases, lower-priced) products support different models, but perhaps with a narrower range of algorithmic sophistication. The model inventory in Alteryx Analytics Gallery includes such capabilities as regression analysis, decision trees, association rule analysis, classification and time series analysis. KNIME includes methods for text mining, image mining and time series analysis, and also integrates machine learning algorithms from other open source projects, including Weka, R and JFreeChart.
The scope of the data to be analyzed has multiple facets, including structured vs. unstructured information as well as access to conventional on-premises databases and data warehouses, cloud-based data sources, and data managed in big data platforms such as Hadoop. Products vary, however, in their degree of support for data managed within less-conventional data lakes (whether in Hadoop or in other NoSQL data management systems intended to provide horizontal scaling). Distinguish among the products based on your organization's specific requirements for accessing and processing data volumes and data variety.
Vendors can be compared in terms of their size. One category is what could be referred to as the mega-vendors, whose big data analytics tools are just one set of products among a massive portfolio of tools. If you work for a larger organization that typically negotiates site-wide enterprise licenses for the full suite of a vendor's tools, a mega-vendor such as IBM, SAS, SAP or Oracle may be a reasonable choice.