In my previous post, I was originally thinking about using a conventional cubic spline interpolating polynomial. With a very large data set, though, that method would almost necessarily involve splitting the data into manageable chunks, because the matrix needed to determine the interpolating polynomials would exceed the limits of most software packages.

In your first post you mentioned a file with 300000 rows. Does that mean you have search term rankings from 1 to approximately 300000, but are missing the number of searches for some of them?

Did you get this resolved? I have a simpler approach if you are satisfied that the general form of your data follows the shape given by the exponential model in post #5. The simplified version does not examine the entire data set, only the known points immediately above and below the gap, and the model parameters "a" and "b" are determined for the curve that includes those two known data points. Your missing "gap" point(s) are then determined using the "a" and "b" for this local curve.

The Option 1 full data set trendline is shown as a dotted blue curve. The values returned by the Option 2 formula are open red circles. You'll see that fitting the entire curve (the blue trendline) can lead to substantial errors relative to known points. For that reason, it would be prudent to avoid that approach. However, the general character of the data does produce a shape similar to that of the exponential model used for the trendline.
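For anyone who wants to try the local two-point idea outside a spreadsheet, here is a rough Python sketch. It assumes the exponential model has the form y = a * exp(b * x), with x the ranking and y the number of searches. That form is my assumption, since the exact formula from post #5 is not repeated here. The function names and the sample numbers are made up for illustration and are not the Option 2 formula itself.

```python
import math


def local_rate(x0, y0, x1, y1):
    """Return b for the curve y = a * exp(b * x) through two known points.

    Assumed model form; both y values must be positive so the log exists.
    """
    if x0 == x1:
        raise ValueError("the two known points need distinct x values")
    if y0 <= 0 or y1 <= 0:
        raise ValueError("an exponential curve needs positive y values")
    return math.log(y1 / y0) / (x1 - x0)


def local_params(x0, y0, x1, y1):
    """Return (a, b) for the local curve.

    Note: a = y0 * exp(-b * x0) can overflow when x0 is very large,
    so fill_gap below avoids computing a explicitly.
    """
    b = local_rate(x0, y0, x1, y1)
    a = y0 * math.exp(-b * x0)
    return a, b


def fill_gap(x0, y0, x1, y1, missing_xs):
    """Estimate y at each missing x lying between the two known points.

    Uses y = y0 * exp(b * (x - x0)), which is algebraically the same as
    a * exp(b * x) but stays numerically stable for large rankings.
    """
    b = local_rate(x0, y0, x1, y1)
    return [y0 * math.exp(b * (x - x0)) for x in missing_xs]


# Hypothetical data: rank 10 has 800 searches, rank 13 has 100,
# and ranks 11 and 12 are the gap to be filled.
estimates = fill_gap(10, 800.0, 13, 100.0, [11, 12])
print(estimates)
```

With these made-up numbers the curve halves at each step in rank, so the two gap estimates come out at about 400 and 200. Because only the two neighboring known points are used, the estimates always agree with the known data at both ends of the gap, which is what the full data set trendline does not guarantee.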