Introduction: Lately I've been looking into different machine learning methods to tackle various business problems. By now I have a good, basic understanding of most regression and classification methods, and I'm able to use them to predict a numeric value from other numeric values and/or simple categories (e.g. an employee's salary given age, years of experience and level of education) or to make a binary classification (e.g. whether this employee will leave the company, based on the same variables).
What I'm looking for: However, I haven't found the right method for the problem I initially wanted to solve, which involves predicting a non-numeric, non-binary value from a mix of numeric and categorical data. I'm not looking for an in-depth explanation of how to solve the exact problem, merely advice on which techniques/methods to look into. Ideally something that could be done with R.
The business problem: I have historical data on public tenders (i.e. public sector institutions buying goods/services from private contractors through calls for tenders). The data includes variables like:
- Orderer - i.e. who announced the tender (1 of ~150 municipalities/state institutions)
- Type of procurement (1 or more of thousands of industrial classification codes)
- Estimated value of contract - a numeric estimate of the contract's value (made before the winner is chosen).
- Winner - i.e. which contractor won the tender (1 of ~2000 private