Cutoff time and training window in featuretools

Suppose I have two datasets (corresponding to two entities in my entityset):
First one: customers (cust_id, name, birthdate, customer_since)
Second one: bookings (booking_id, service, chargeamount, booking_date)
Now I want to create a dataset with features built from all customers (no matter how long they have been customers) but only from bookings in the last two years. How do I have to use the "last_time_index"? Can I set a "last_time_index" on only one entity? In this case only on the bookings entity, because I want ALL customers, but not all bookings...
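
The outcome asked for above (every customer kept, only recent bookings aggregated) can be sketched in plain Python. This does not call featuretools, where the `training_window` argument of `ft.dfs` is the mechanism the related questions on this page refer to. The `cust_id` column on bookings, the cutoff date, and all sample rows are assumptions made for illustration.

```python
from datetime import date

# Hypothetical data; the cust_id foreign key on bookings is an assumption.
customers = [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 3}]
bookings = [
    {"booking_id": 10, "cust_id": 1, "chargeamount": 50.0, "booking_date": date(2016, 3, 1)},
    {"booking_id": 11, "cust_id": 1, "chargeamount": 70.0, "booking_date": date(2019, 6, 15)},
    {"booking_id": 12, "cust_id": 2, "chargeamount": 20.0, "booking_date": date(2018, 2, 1)},
    {"booking_id": 13, "cust_id": 2, "chargeamount": 30.0, "booking_date": date(2019, 11, 30)},
    {"booking_id": 14, "cust_id": 3, "chargeamount": 90.0, "booking_date": date(2017, 12, 31)},
]

cutoff = date(2020, 1, 1)
window_start = date(2018, 1, 1)  # two years before the cutoff

# Only bookings inside the window contribute to the features.
recent = [b for b in bookings if window_start < b["booking_date"] <= cutoff]

# Start from ALL customers so that those without recent bookings still get a row.
features = {c["cust_id"]: {"count": 0, "total": 0.0} for c in customers}
for b in recent:
    row = features[b["cust_id"]]
    row["count"] += 1
    row["total"] += b["chargeamount"]
```

Customer 3 has no booking inside the window but still appears in `features` with zero values, which is the "all customers, but not all bookings" behaviour described in the question.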

featuretools - LookupError: Time index not found in dataframe

Here is the code to reproduce this issue, but it can be avoided by removing the "orders" entity.

    import featuretools as ft
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'member_id': ['AAA', 'AAA', 'AAA', 'AAA', 'AAA', 'JJJ', 'JJJ', 'JJJ'],
                       'order_id': ['0001','0001','0001','0002','0002','1111','1111','1111'],
                       'order_datee': ['2011-01-01','2011-01-01','2011-01-01','2014-01-01','2014-01-01','2013-01-01','2013-01-01','2013-01-01'],
                       'member_join_datee': ['2011-01-01','2011-01-01','2011-01-...

featuretools last_time_index is not set

I've built an entity set and one of the tables in this entity set is called "inspections". I've set the time_index column for this table, but when running dfs, I'm getting the warning "Using training_window but last_time_index is not set on entity inspections". The documentation shows that this should be set as a series: last_time_index (pd.Series) – Time index of the last event for each instance across all child entities. Can someone please provide an example of how to set the last_time_index and what values to set it to? Note, the calculations a...

featuretools - Select amount of past data when calculating features

I'm wondering if there is a way to automatically select the amount of past data when calculating features. For example, I might want to predict when a customer is going to make their next purchase, so it would be good to know a count of purchases or average purchase price by different date cutoffs, e.g. purchases in the last 12 months, last 3 months, 7 days etc. What is the best way to approach this with featuretools?...
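
To make the idea of several look-back windows concrete, here is a minimal plain-Python sketch of what "count and average price in the last N days before a cutoff" computes. It does not use featuretools; the purchase log, the cutoff date, and the `window_stats` helper are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical purchase log for one customer: (timestamp, price).
purchases = [
    (datetime(2019, 1, 5), 20.0),
    (datetime(2019, 10, 1), 35.0),
    (datetime(2019, 12, 20), 15.0),
    (datetime(2019, 12, 28), 30.0),
]

def window_stats(rows, cutoff, days):
    """Count and mean price of purchases in the interval (cutoff - days, cutoff]."""
    start = cutoff - timedelta(days=days)
    prices = [price for ts, price in rows if start < ts <= cutoff]
    mean = sum(prices) / len(prices) if prices else None
    return len(prices), mean

cutoff = datetime(2019, 12, 31)
# One (count, mean) pair per look-back window, keyed by window length in days.
features = {days: window_stats(purchases, cutoff, days) for days in (7, 90, 365)}
```

Each window length yields its own pair of features, so the same raw data produces "last 7 days", "last 90 days", and "last 365 days" variants relative to a single cutoff.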

feature engineering - FeatureTools TypeError: unhashable type: 'set'

I'm trying this code for featuretools:

    features, feature_names = ft.dfs(entityset = es, target_entity = 'demo',
                                     agg_primitives = ['count', 'max', 'time_since_first', 'median', 'time_since_last', 'avg_time_between', 'sum', 'mean'],
                                     trans_primitives = ['is_weekend', 'year', 'week', 'divide_by_feature', 'percentile'])

But I had this error:

    TypeError Traceback (most recent call last)
    <ipython-input-17-89e925ff895d> in <module>
          3 agg_primitives = ['co...

How can you restrict the feature matrix calculation in featuretools to avoid long run times?

I want to calculate a second-order feature (depth = 2). Because of the entity structure, the feature matrix calculation has to evaluate so many combinations that it takes "years". Is there a way to specify, via rules or settings, the list of features to be calculated? I have a Customer table (cid, ...) with fixed data (e.g. birth date) related to each customer. Additionally, I have two different tables (A and B) (cid, MonthlyReportPeriod, ...) derived from monthly customer behavior (e.g. #orders) on different products. The wanted features of ...

featuretools - When is the time_type set to NumericTimeIndex or DatetimeTimeIndex in the entityset?

I have problems calculating the feature_matrix using a cutoff_times table because of a type mismatch between the cutoff times in the cutoff table and the time_type of the entityset. I am trying to understand the predict-next-purchase example using synthetic data. I got to the point of having cutoff_labels whose cutoff time entries are of type 'datetime64[ns]' or pandas._libs.tslibs.timestamps.Timestamp. The dfs procedure gave me an error message:
cutoff_time times must be numeric: try casting via pd.to_numeric(cutoff_time['time'])
I figured that the problem lies in t...

featuretools - where can I find make_labels?

I cannot find make_labels. I thought it would be part of the independent package utils, but I guess it was part of featuretools.utils; you just have make_temporal_cutoffs instead. So how do you use that? What would be the translation of the example code:

    label_times = pd.concat([utils.make_labels(es=instacart_es,
                                               product_name = "Banana",
                                               cutoff_time = pd.Timestamp('March 15, 2015'),
                                               prediction_window = ft.Time...

How do you define a custom primitive with parameters using the Featuretools package?

I'm trying to create a custom transformation using the Featuretools package where I can input a parameter and change the behaviour of the function. For example, for the following custom log transformation class I wish to add a base parameter so I can do log transformations of features with different bases:

    class Log(TransformPrimitive):
        """Computes the logarithm for a numeric column."""
        name = 'log'
        input_types = [Numeric]
        return_type = Numeric

        def get_function(self):
            return np.log

How would I go about implementing such a p...
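
One common way to parameterize such a class is to store the parameter in `__init__` and have `get_function` return a closure that reads it. The sketch below is a plain-Python stand-in only: `LogBase` is a hypothetical name, it deliberately does not subclass `TransformPrimitive`, and it uses `math.log` on lists instead of `np.log` on columns. Whether this pattern carries over unchanged to a given Featuretools version is an assumption to check against the library documentation.

```python
import math

class LogBase:
    """Plain-Python stand-in for a primitive whose behaviour depends on a parameter."""
    name = "log_base"

    def __init__(self, base=math.e):
        # Store the parameter on the instance so get_function can read it later.
        self.base = base

    def get_function(self):
        base = self.base

        def log_with_base(values):
            # Apply the logarithm element-wise for the configured base.
            return [math.log(v, base) for v in values]

        return log_with_base

# Two differently configured instances produce two different functions.
log2 = LogBase(base=2).get_function()
log10 = LogBase(base=10).get_function()
```

In an actual primitive, the class attributes from the question (`input_types`, `return_type`) would stay on the class, and only the constructor and the returned function would change.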

featuretools - How do you search for particular features?

The last time I tried featuretools, I was searching for a particular feature that I was expecting. When you have more than 30 features, it is rather time-consuming to find the feature. Does the feature_names object (the second return object of the dfs method) have a method to search for text patterns (regex)? feature_names is a list of "featuretools.feature_base.feature_base.IdentityFeature". Post Scriptum: In the API section of the featuretools documentation, the return objects are not described...
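
Since the second return value is an ordinary Python list, a regex search can be sketched with the standard `re` module. The names below are hypothetical strings; with real feature objects, the assumption is that each one is first converted to its name string (for example via a `get_name()` method, which should be verified for the installed version).

```python
import re

# Hypothetical feature names, standing in for the names of the objects returned by dfs.
feature_names = [
    "birthdate",
    "COUNT(bookings)",
    "MEAN(bookings.chargeamount)",
    "YEAR(customer_since)",
    "SUM(bookings.chargeamount)",
]

def search_features(names, pattern):
    """Return the names that match the regular expression, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if regex.search(name)]

# Every feature built on the chargeamount column.
matches = search_features(feature_names, r"chargeamount")
# Every COUNT or SUM aggregation, anchored at the start of the name.
aggregates = search_features(feature_names, r"^(count|sum)\(")
```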

How do you detect or control dangerous usage of variables and their transformation using featuretools?

When you apply the transformation year, month, or day to a purchase date, you can very easily run into a problem! Imagine your purchases are from 2018 but you want to apply a model to data from 2019. The model is developed using features automatically generated by featuretools, including simple transformations like year. The problem here is that the model does not know 2019, meaning the model is not generally valid for new data...

How do you implement a weighted sum transform primitive in Featuretools?

I'm trying to figure out how to implement a weighted cum sum primitive for Featuretools. The weighting shall depend on time_since_last like

    cum_sum(amount) = sum_{i} exp(-a_{i}) * amount_{i}

where i are rolling 6-month periods.... Above you find the original question. After a while of trial and error I came up with this code for my purpose, using the data and initial setup for entity and relation from here:

    def weight_time_until(array, time):
        diff = pd.DatetimeIndex(array) - time
        s = np.floor(diff.days/365/0.5)
        aWidth = 9 ...
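
The formula in the question can be illustrated with a small plain-Python sketch. It is not the asker's implementation, which is truncated above: here the exponent a_i is assumed to be `decay * s`, where `s` is the number of whole half-year periods between a transaction and the cutoff (the same `days / 365 / 0.5` bucketing as in the excerpt). The function name `weighted_sum`, the `decay` parameter, and the sample data are hypothetical.

```python
import math
from datetime import date

def weighted_sum(dates, amounts, cutoff, decay=1.0):
    """Sum amounts weighted by exp(-decay * s), s = whole 6-month periods before cutoff."""
    total = 0.0
    for day, amount in zip(dates, amounts):
        days_before = (cutoff - day).days
        s = math.floor(days_before / 365 / 0.5)  # half-year bucket index
        total += math.exp(-decay * s) * amount
    return total

# Hypothetical transactions falling into buckets 0, 1 and 2 relative to the cutoff.
dates = [date(2019, 12, 1), date(2019, 5, 1), date(2018, 11, 1)]
amounts = [100.0, 100.0, 100.0]
result = weighted_sum(dates, amounts, cutoff=date(2019, 12, 31))
```

With `decay=0` every weight is 1 and the function reduces to an ordinary sum, which is a convenient sanity check for the bucketing.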