

Centre Interdisciplinaire
de Recherche et d’Innovation
en Cybersécurité et Société
1.
Ngouanfouo, C.; Davoust, A.
Detecting Machine-Generated Text using Grammatical Features (conference paper)
In: Proc. Int. Conf. Tools Artif. Intell. ICTAI, pp. 843–848, IEEE Computer Society, 2025, ISSN: 1082-3409; ISBN: 979-833154919-0.
Tags: AI Text Detection, CNN, Computational grammars, Detection methods, Language model, Machine-generated texts, Natural language generation, Natural language processing systems, Neural encoding, Neural modelling, Part Of Speech, Part-of Speech, Speech communication, Text detection, Written texts
@inproceedings{ngouanfouo_detecting_2025,
title = {Detecting Machine-Generated Text using Grammatical Features},
author = {C. Ngouanfouo and A. Davoust},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105031903675&doi=10.1109%2FICTAI66417.2025.00123&partnerID=40&md5=5783b8797a3425f9dfa737343ee757d2},
doi = {10.1109/ICTAI66417.2025.00123},
isbn = {979-833154919-0},
issn = {1082-3409},
year = {2025},
date = {2025-01-01},
booktitle = {Proc. Int. Conf. Tools Artif. Intell. ICTAI},
pages = {843--848},
publisher = {IEEE Computer Society},
abstract = {Large Language Models (LLMs) have advanced natural language generation but pose ethical and practical challenges, making it crucial to detect machine-generated texts. Traditional detection methods rely on complex, hard-to-interpret neural encodings and model-specific features like perplexity. This study explores whether grammatical patterns, specifically sequences of parts of speech (POS), including punctuation and symbols, can distinguish machine-written texts from human ones. Using a CNN classifier on POS sequences, the approach achieves nearly 90 % accuracy on a benchmark dataset. Combining POS-based features with neural embeddings improves performance, and the model shows robustness against adversarial attacks, though it is less effective on short texts. © 2025 IEEE.},
keywords = {AI Text Detection, CNN, Computational grammars, Detection methods, Language model, Machine-generated texts, Natural language generation, Natural language processing systems, Neural encoding, Neural modelling, Part Of Speech, Part-of Speech, Speech communication, Text detection, Written texts},
pubstate = {published},
tppubtype = {inproceedings}
}



