You are building a logistic regression model to predict whether a tax filer will be audited within the next two years. Your training set population is 1000 filers. The audit rate in your training data is 4.2%. What is the sum of the probabilities that the model assigns to all the filers in your training set that have been audited?
Answer : A
You are using the Apriori algorithm to determine the likelihood that a person who owns a home has a good credit score. You have determined that the confidence for the rules used in the algorithm is > 75%. You calculate lift = 1.011 for the rule, "People with good credit are homeowners". What can you determine from the lift calculation?
Answer : C
In a Student's t-test, what is the meaning of the p-value?
Answer : A
In the MapReduce framework, what is the purpose of the Map Function?
Answer : A
What is a core deliverable at the end of the analytic project?
Answer : C
Consider a database with 4 transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
The minimum support is 25%. Which rule has a confidence equal to 50%?
Answer : A
Which data asset is an example of semi-structured data?
Answer : A
Refer to the exhibit.
Answer : A
In linear regression modeling, which action can be taken to improve the linearity of the relationship between the dependent and independent variables?
Answer : A
A Data Scientist is assigned to build a model from a reporting data warehouse. The warehouse contains data collected from many sources and transformed through a complex, multi-stage ETL process. What is a concern the data scientist should have about the data?
Answer : A
The average purchase size from your online sales site is $17, 200. The customer experience team believes a certain adjustment of the website will increase sales. A pilot study on a few hundred customers showed an increase in average purchase size of $1.47, with a significance level of p=0.1.
The team runs a larger study, of a few thousand customers. The second study shows an increased average purchase size of $0.74, with a significance level of 0.03. What is your assessment of this study?
Answer : A
You have been assigned to run a logistic regression model for each of 100 countries, and all the data is currently stored in a PostgreSQL database. Which tool/library would you use to produce these models with the least effort?
Answer : A
Refer to the exhibit.
Answer : A
Consider a database with 4 transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
You decide to run the association rules algorithm where minimum support is 50%. Which rule has a confidence at least 50%?
Answer : A
Refer to the Exhibit.
Answer : A
Have any questions or issues ? Please dont hesitate to contact us