Történeti magánéleti korpusz (TMK)/Old and Middle Hungarian Corpus of Informal Language Use (hdl:21.15109/ARP/KPFUAG)
(Ó- és középmagyar történeti magánéleti korpusz)

Document Description

Citation

Title:

Történeti magánéleti korpusz (TMK)/Old and Middle Hungarian Corpus of Informal Language Use

Identification Number:

hdl:21.15109/ARP/KPFUAG

Distributor:

ARP

Date of Distribution:

2025-12-11

Version:

1

Bibliographic Citation:

Gugan, Katalin; Dömötör Adrienne; Varga Mónika; Novák Attila, 2025, "Történeti magánéleti korpusz (TMK)/Old and Middle Hungarian Corpus of Informal Language Use", https://hdl.handle.net/21.15109/ARP/KPFUAG, ARP, V1

Study Description

Citation

Title:

Történeti magánéleti korpusz (TMK)/Old and Middle Hungarian Corpus of Informal Language Use

Alternative Title:

Ó- és középmagyar történeti magánéleti korpusz

Identification Number:

hdl:21.15109/ARP/KPFUAG

Authoring Entity:

Gugan, Katalin (ELTE Research Centre for Linguistics)

Dömötör, Adrienne (ELTE Research Centre for Linguistics)

Varga, Mónika (ELTE Research Centre for Linguistics)

Novák, Attila (Pázmány Péter Catholic University Faculty of Information Technology and Bionics)

Distributor:

ARP

Access Authority:

Gugan, Katalin

Depositor:

Gugan, Katalin

Date of Deposit:

2025-12-11

Holdings Information:

https://hdl.handle.net/21.15109/ARP/KPFUAG

Study Scope

Keywords:

Social Sciences, annotated historical corpus, Old Hungarian, Middle Hungarian

Abstract:

The Old and Middle Hungarian Corpus of Informal Language Use (TMK) contains the text types assumed to be closest to the Old and Middle Hungarian vernacular. It is built from private letters and records of legal proceedings dating from before 1772, in roughly equal proportions. It primarily offers material for historical morphological and sociolinguistic research, but it can also be used to study historical syntax, pragmatics, and lexicology. The corpus comprises approximately 8.6 million characters (Hungarian part: 7.68 million characters, 1.112 million analyzed text words).

Other Study Description Materials

Related Publications

Citation

Title:

Creation of an annotated corpus of Old and Middle Hungarian court records and private correspondence

Identification Number:

10.1007/s10579-017-9393-8

Bibliographic Citation:

Attila Novák, Katalin Gugán, Mónika Varga, Adrienne Dömötör: Creation of an annotated corpus of Old and Middle Hungarian court records and private correspondence. Language Resources and Evaluation 52 (2018): pp. 1–28

Citation

Title:

Kiútkeresés a morfológiai labirintusból – korpuszépítés ó- és középmagyar kori magánéleti szövegekből [Searching for a way out of the morphological labyrinth: corpus building from Old and Middle Hungarian private-life texts]

Identification Number:

10.15776/NYK.2017.113.3

Bibliographic Citation:

Dömötör Adrienne, Gugán Katalin, Novák Attila, Varga Mónika: Kiútkeresés a morfológiai labirintusból – korpuszépítés ó- és középmagyar kori magánéleti szövegekből. Nyelvtudományi Közlemények (NyK) 113 (2017): 85–110.