Part 1: Document Description
|
|
Citation |
|
|
Title: |
Subject categorisation experiments with AI in MTMT |
|
Identification Number: |
hdl:21.15109/ARP/VWQFD2 |
|
Distributor: |
ARP |
|
Date of Distribution: |
2026-05-08 |
|
Version: |
2 |
|
Bibliographic Citation: |
Micsik, András; Tanácsi, Roland, 2026, "Subject categorisation experiments with AI in MTMT", https://hdl.handle.net/21.15109/ARP/VWQFD2, ARP, V2 |
|
Citation |
|
|
Title: |
Subject categorisation experiments with AI in MTMT |
|
Identification Number: |
hdl:21.15109/ARP/VWQFD2 |
|
Authoring Entity: |
Micsik, András (HUN-REN SZTAKI) |
|
Tanácsi, Roland (HUN-REN SZTAKI) |
|
|
Date of Production: |
2025-11-15 |
|
Software used in Production: |
Python |
|
Grant Number: |
RRF-2.3.1-21-2022-00004 |
|
Distributor: |
ARP |
|
Access Authority: |
Micsik, András |
|
Depositor: |
Micsik, András |
|
Date of Deposit: |
2026-02-03 |
|
Holdings Information: |
https://hdl.handle.net/21.15109/ARP/VWQFD2 |
|
Study Scope |
|
|
Keywords: |
Computer and Information Science, subject classification, scientific categorization, transformer models, Support Vector Classifier, data cleaning, large language models |
|
Topic Classification: |
artificial intelligence |
|
Abstract: |
Code, sample data and results for subject categorisation experiments with AI in MTMT |
|
Methodology and Processing |
|
|
Sources Statement |
|
|
Data Access |
|
|
Notes: |
CC BY-NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0) |
|
Other Study Description Materials |
|
|
Related Publications |
|
|
Citation |
|
|
Title: |
A Comparative Evaluation of AI Approaches to Large-Scale Scientific Subject Classification |
|
Identification Number: |
10.3390/bdcc10050151 |
|
Bibliographic Citation: |
Tanácsi, R., & Micsik, A. (2026). A Comparative Evaluation of AI Approaches to Large-Scale Scientific Subject Classification. Big Data and Cognitive Computing, 10(5), 151. |
|
Label: |
README.txt |
|
Notes: |
text/plain |
|
Label: |
lvl4-mtmt-large-multiclass-svm-rbf.zip |
|
Notes: |
application/octet-stream |
|
Label: |
svm_rbf_confusion_matrix_percent.csv |
|
Notes: |
text/csv |
|
Label: |
annif.csv |
|
Notes: |
text/csv |
|
Label: |
embedding_scikit.csv |
|
Notes: |
text/csv |
|
Label: |
scibert_lvl3.csv |
|
Notes: |
text/csv |
|
Label: |
scibert_lvl4.csv |
|
Notes: |
text/csv |
|
Label: |
scibert_lvl4_subtopics.csv |
|
Notes: |
text/csv |
|
Label: |
scibert_moe.csv |
|
Notes: |
text/csv |
|
Label: |
frascati_mapping.json |
|
Notes: |
application/json |
|
Label: |
sample_evaluation_data.csv |
|
Notes: |
text/csv |
|
Label: |
sample_evaluation_data.npy |
|
Notes: |
application/octet-stream |
|
Label: |
sample_training_data.csv |
|
Notes: |
text/csv |
|
Label: |
sample_training_data.npy |
|
Notes: |
application/octet-stream |
|
Label: |
eval_svm_rbf.py |
|
Notes: |
text/x-python |
|
Label: |
requirements.txt |
|
Notes: |
text/plain |
|
Label: |
train_svm_rbf.py |
|
Notes: |
text/x-python |