Posts by Collection

portfolio

Linguistic Evaluation of Machine-Generated ‘Real’ and ‘Fake’ News

Research on machine-generated fake news has often conflated two qualities, treating the task of identifying “machine-generated” news as equivalent to the task of identifying “fake news.” In this project, we create datasets of machine-generated “real news” and machine-generated “fake news” by using GPT-Neo 1.3B to perform text generation on input from the LIAR dataset, which includes 12.8K short statements. The project has two main goals: 1) to assess whether this approach is an effective way to create comparable machine-generated real and fake news, and 2) to ascertain whether there are any detectable stylometric or linguistic differences between real and fake news generated in this way.

Designing a relational database system for the Self-Sufficiency Standard: Representing the cost of living

The goal of this project was to develop a more efficient database workflow and structure to better support the community of researchers using the Self-Sufficiency Standard (SSS). The relational database was built with SQLAlchemy and Python and holds the SSS for the 42 states in which the Standard has been calculated. It includes a primary table with the SSS by family household type and several secondary tables, such as the cost of broadband and cellphone(s). The research also aimed to increase the transparency and accessibility of the data for stakeholders with varying technical backgrounds through robust documentation.
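As a rough illustration of the primary/secondary table layout described above, the sketch below uses Python's built-in `sqlite3` module as a stand-in for SQLAlchemy so that it runs without extra dependencies. The table names, column names, and the single row of values are all hypothetical placeholders, not the project's actual schema or data.

```python
import sqlite3


def build_sss_db(conn):
    """Create a hypothetical layout: one primary table, one secondary table."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE self_sufficiency_standard (
            id INTEGER PRIMARY KEY,
            state TEXT NOT NULL,
            family_type TEXT NOT NULL,
            annual_standard REAL NOT NULL
        );
        CREATE TABLE broadband_and_cellphone (
            id INTEGER PRIMARY KEY,
            sss_id INTEGER NOT NULL
                REFERENCES self_sufficiency_standard(id),
            monthly_cost REAL NOT NULL
        );
        """
    )


def demo():
    """Insert one made-up row per table and join them back together."""
    conn = sqlite3.connect(":memory:")
    build_sss_db(conn)
    cur = conn.execute(
        "INSERT INTO self_sufficiency_standard "
        "(state, family_type, annual_standard) VALUES (?, ?, ?)",
        ("Example State", "1 adult, 1 preschooler", 50000.0),  # placeholder
    )
    conn.execute(
        "INSERT INTO broadband_and_cellphone (sss_id, monthly_cost) "
        "VALUES (?, ?)",
        (cur.lastrowid, 100.0),  # placeholder value
    )
    rows = conn.execute(
        "SELECT s.state, s.family_type, s.annual_standard, b.monthly_cost "
        "FROM self_sufficiency_standard AS s "
        "JOIN broadband_and_cellphone AS b ON b.sss_id = s.id"
    ).fetchall()
    conn.close()
    return rows
```

Keying each secondary table to the primary table by a foreign key keeps each cost component in one place while still letting a single join recover the full picture for a given state and family type.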

Improving Baseline for GPT-2 Conversational Agent

Implemented a conversational agent using the Transformers GPT-2 model for text generation on the MultiWOZ 2.2 dataset, tuning decoding methods such as beam search, top-k sampling, and temperature sampling.
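To make two of the decoding methods named above concrete, here is a minimal, library-free sketch of temperature scaling and top-k filtering over a toy list of logits. The function names and the toy logits are illustrative assumptions; this is not the project's code, and beam search is left out to keep the example short.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize; zero out the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    mass = sum(probs[i] for i in keep)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]


def sample_next_token(logits, k, temperature, rng):
    """Sample one token index using temperature scaling, then top-k."""
    probs = top_k_filter(softmax_with_temperature(logits, temperature), k)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy usage: four candidate tokens, keep the best two.
toy_logits = [2.0, 1.0, 0.1, -1.0]
token = sample_next_token(toy_logits, k=2, temperature=0.7,
                          rng=random.Random(0))
```

With `k=2`, only the two highest-scoring tokens can ever be sampled, and lowering the temperature shifts more of the remaining probability mass toward the top token.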

Using BERT to Measure Legalese and Formalness of Narratives in Traffic Ticket Dispute Resolution

Developed a method to measure the formalness and legalese level of narratives in traffic ticket dispute resolution data. Pseudo-perplexity is used to measure how close a sentence is to the language a pre-trained model expects, and pre-trained language models such as BERT uncased and Legal-BERT were applied to score each narrative's legalese and formalness level.
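The sketch below shows the arithmetic behind pseudo-perplexity under its usual definition for masked language models: each token is masked in turn, the model's log-probability of the original token is recorded, and the score is the exponential of the negative mean of those log-probabilities. Whether the project normalized in exactly this way is an assumption, and the function takes plain numbers rather than calling a real model.

```python
import math


def pseudo_perplexity(token_log_probs):
    """exp(-mean(log p(token_i | all other tokens))) for one sentence.

    token_log_probs: natural-log probabilities, one per token, each taken
    with that token masked (assumed to come from a masked LM such as BERT).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Toy usage: a sentence whose tokens each get probability 0.5 scores about 2.
score = pseudo_perplexity([math.log(0.5)] * 4)
```

A lower score means the model finds the sentence more predictable, so comparing scores from a general-domain model and a legal-domain model gives one possible signal of how legal or formal a sentence reads.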

Seq2Seq Polynomials

An LSTM model that learns to expand single-variable polynomials by taking the factorized sequence as input and predicting the expanded sequence.
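To illustrate the input/output pairs such a model would be trained on, the following sketch expands a product of linear factors and renders both sides as strings. The string format and helper names are hypothetical; the project's actual dataset format is not specified here.

```python
def expand_linear_factors(factors):
    """Multiply out factors given as (a, b) pairs, each meaning (a*x + b).

    Returns the coefficients of the expanded polynomial, lowest degree first.
    """
    coeffs = [1]
    for a, b in factors:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] += c * b      # contribution from the constant term b
            nxt[i + 1] += c * a  # contribution from the a*x term
        coeffs = nxt
    return coeffs


def format_factorized(factors):
    """Render the source sequence, e.g. (1*x+2)*(1*x+3)."""
    return "*".join(f"({a}*x{b:+d})" for a, b in factors)


def format_expanded(coeffs):
    """Render the target sequence, highest degree first, skipping zeros."""
    terms = []
    for power in range(len(coeffs) - 1, -1, -1):
        c = coeffs[power]
        if c == 0:
            continue
        if power == 0:
            terms.append(f"{c:+d}")
        elif power == 1:
            terms.append(f"{c:+d}*x")
        else:
            terms.append(f"{c:+d}*x**{power}")
    return "".join(terms).lstrip("+") or "0"


# Toy usage: one (source, target) pair for a seq2seq model.
factors = [(1, 2), (1, 3)]
source = format_factorized(factors)
target = format_expanded(expand_linear_factors(factors))
```

Generating pairs this way gives the model a source string and an exactly matching target string, so predictions can be checked by simple string equality.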

publications

Shifts in Family Businesses Due to the Covid-19 Pandemic

Published in Global Journal of Entrepreneurship, 2021

Recommended citation: Arik, M., Riley, J., Mirsaidova, A., Sumaiya, M. (2021). Shifts in Family Businesses Due to the Covid-19 Pandemic. Global Journal of Entrepreneurship, 153. https://www.igbr.org/wp-content/uploads/2021/07/GJE_Vol_5_SI_2021.pdf

Understanding How Students Want to Learn to Collaborate with a Collaboration Analytics System

Published in Artificial Intelligence in Education for Sustainable Society, 2023

talks

teaching

MSAI 348: Introduction to Artificial Intelligence

Graduate course, Northwestern University, Computer Science, 2023

Held office hours for about 40 students, supporting coursework on topics such as search, knowledge bases, machine learning, optimization, naive Bayes, and neural networks, among others.

MBAI 417: Data and Data Intensive Systems

Graduate course, Northwestern University, Kellogg School of Management & Computer Science Department, 2023

Designed the course lab and homework assignments applying ML and NLP techniques, and mentored about 43 Kellogg MBAI students in learning SQL and Python fundamentals.

Aziza Mirsaidova
