12/12/2022

Kaggle competition spelling corrector

I was researching a Kaggle competition and used a logistic regression classifier to test the top 10 competitors' approaches. I'm still fairly new to classification problems, so I just tested classifiers without many modifications. In this case I used scikit-learn's logreg. I cleaned the test/train data and used it to generate a ROC curve. My area under the curve was 0.89, which would have placed me in 1st place with a significant lead, and this seems quite impossible to me considering my implementation's simplicity. Could someone tell me if my program is doing something incorrectly that gives such a score (e.g. somehow overfitting, or a bug in the code)?

    import csv
    from sklearn.feature_extraction.text import TfidfVectorizer
    from [...] import SnowballStemmer

    def vectorize_dataset(subpath, stem, vectorizer):
        # [...]
        return (vectorizer.fit_transform(comments), labels)
        # [...]
        return (vectorizer.transform(comments), labels)

    def print_auroc_for_classifier(vect_tuple, classifier):
        # [...]
        for sample, label in zip(vect_tuple[0], vect_tuple[1]):
            y_score.append(classifier.predict_proba(sample))
        # [...]
        fpr, tpr, thresholds = roc_curve(y_true, y_score)
        # [...]
        plt.title('Receiver operating characteristic example')
        # [...]

    train_tuple = vectorize_dataset(' rain', True, vectorizer)
    test_tuple = vectorize_dataset(' est', True, vectorizer)
    logreg = linear_model.LogisticRegression(C=7)
    logreg.fit(train_tuple[0].toarray(), train_tuple[1])
    print_auroc_for_classifier(test_tuple, logreg)

I do not understand the C parameter very well. If it is the reason my score is this high, please let me know; I would appreciate any advice on finding a good value for it.

To run the code:

1. From the Kaggle link, download train.csv and test_with_solutions.csv.
2. Rename test_with_solutions.csv to test.csv.
3. In the code, set path to be the path to the .csv files.

[...] csv file and clean the text (used the preprocessor package and manually replaced certain characters)
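One way to cross-check an AUC such as the 0.89 reported in the post, independently of the plotting code, is to compute it directly from the labels and the positive-class scores. The sketch below is only an illustration under stated assumptions: the function `auroc` and the toy values in `proba_rows` are hypothetical and do not come from the original code. It assumes binary labels 0/1 and that each `predict_proba` row has the form [P(class 0), P(class 1)], so only the second entry is used as the score.

```python
def auroc(y_true, y_score):
    """Area under the ROC curve as a pairwise ranking probability.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predict_proba-style output: one [P(class 0), P(class 1)] row
# per sample. These numbers are made up for illustration only.
proba_rows = [[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]]
y_true = [0, 0, 1, 1]

# Use only the positive-class probability as the ranking score.
y_score = [row[1] for row in proba_rows]

print(auroc(y_true, y_score))
```

If scikit-learn is available, `sklearn.metrics.roc_auc_score(y_true, y_score)` should agree with this function on the same inputs. A large gap between the two numbers on the real test set would point to a problem in how `y_true` and `y_score` are assembled rather than in the classifier itself.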