In the latest episode of our ‘Calling Kevin’ video series, we show you how to customise your GA4 report library – updating your Google Analytics reporting interface to include a new, personalised collection of reports.
Follow these quick and easy steps to begin tailoring your GA4 report library menu and navigation for a more efficient reporting experience. We also share a helpful refresher on how to work with topics, templates, plus more!
For more quick GA4 tips, be sure to check out other videos from our ‘Calling Kevin’ series.
The post (Video) How to customise your Report library in GA4 appeared first on Lynchpin.
At Lynchpin, we’ve spent considerable time testing and deploying Luigi in production to orchestrate some of the data pipelines we build and manage for clients. Luigi’s flexibility and transparency make it well suited to a range of complex business requirements and to seamlessly support more collaborative ways of working.
This blog post draws on our hands-on experience with the tool – we’ve stress-tested it to understand how it performs day to day in real-world contexts. We’ll walk through what Luigi is, why we like it, and where it could be improved, then share some practical tips for those considering using it to enhance their data orchestration processes and data pipeline capabilities.
Luigi is an open-source tool developed by Spotify that helps automate and orchestrate data pipelines. It allows users to define dependencies and build tasks with custom logic using Python, offering flexibility and a fairly low barrier to entry for its quite complex functionality.
Despite Spotify’s introduction of a newer orchestration tool, Flyte, Luigi is still widely used by many major brands and continues to receive updates – allowing it to keep maturing as a reliable choice for a range of data orchestration use cases.
Luigi sits amongst many popular tools used for data orchestration in the data engineering space – some of which are paid, while others are similarly open source.
Another tool we’ve used for data orchestration is Jenkins. Although it isn’t designed for more heavy-duty pipelines, we’ve found it to work very well as a lightweight orchestrator, managing tasks and dependencies.
In the following section, we’ll break down some benefits of using Luigi for your data pipelines and a few reasons why you may choose it over a comparable tool such as Jenkins.
Transparent version control:
One of the key advantages of Luigi is that it’s written in Python. This gives you transparent version control over your data pipelines – every change is committed and traceable: you know exactly what change has been made, you can inspect it, and you can see who made it and when. This becomes even more powerful when linked to a CI/CD pipeline, as we do for some of our clients, because the repository then becomes the single source of truth for the pipeline.
With Jenkins, for example, changes can be made and it’s not necessarily obvious what was changed or by which team member (unless explicitly communicated) – which becomes increasingly important when you’re managing more complex data pipelines with many moving parts and dependencies.
Dependency handling and custom logic capabilities:
Managing data pipeline dependencies is where Luigi truly stands out. In a tool like Jenkins, downstream tasks can be orchestrated, but this often requires careful scheduling or wrapper jobs, which can become complicated and manual depending on the complexity of your needs. Luigi simplifies this and enables smoother automation by letting you define all dependencies directly in Python, supporting logic such as: ‘Run a particular job only after a pipeline completes, and only do this on a Sunday or if it failed the previous Sunday.’
This level of custom logic is trivial in Python but can be difficult to replicate in Jenkins, where often the only option is a scheduled Sunday run with no conditions around it.
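To make this concrete, here is a minimal sketch of how that kind of conditional dependency could look in Luigi. The task names, output paths, and the ‘Sunday only’ rule are our own illustrative assumptions rather than a real client pipeline:

import datetime

import luigi


class PipelineComplete(luigi.Task):
    """Stand-in for the final task of the upstream pipeline."""
    date = luigi.DateParameter()

    def output(self):
        return luigi.LocalTarget(f"output/pipeline_{self.date:%Y%m%d}.done")

    def run(self):
        with self.output().open("w") as f:
            f.write("done\n")


class WeeklyReport(luigi.Task):
    """Hypothetical downstream job that only depends on the pipeline on Sundays."""
    date = luigi.DateParameter(default=datetime.date.today())

    def requires(self):
        # Plain Python decides the dependency: only require the upstream
        # pipeline when the run date is a Sunday.
        if self.date.weekday() == 6:
            return PipelineComplete(date=self.date)
        return []

    def output(self):
        return luigi.LocalTarget(f"output/weekly_report_{self.date:%Y%m%d}.csv")

    def run(self):
        with self.output().open("w") as f:
            f.write("report placeholder\n")

Because the condition is just Python, extending it to ‘or if it failed the previous Sunday’ is a matter of adding another branch rather than fighting a scheduler.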
Pipeline failure handling:
Luigi considers all tasks idempotent. Once a task has run, it’s marked as ‘done’ and won’t be re-run unless you manually remove its output. This is a particularly useful feature if you have big, complex pipelines and only need to re-run certain jobs that have failed. You won’t need to re-run everything, but can find the failed task, delete its output file, and save time when re-executing the job.
Backfilling at the point of a task:
Luigi handles backfilling easily by allowing users to pass parameters directly into tasks.
This allows you to retrieve historical data (for example, backfilling from the beginning of last year to present) without having to change the script or config files.
Luigi treats each distinct set of parameters as a distinct task, so even if the job has previously run, passing new parameter values will simply run it again for those values rather than skipping it.
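As a rough illustration (the task and path names are ours, not a prescribed pattern), a date-parameterised task can be backfilled over an arbitrary range simply by building one task per date:

import datetime

import luigi


class ExtractDailyData(luigi.Task):
    """Hypothetical daily extract, parameterised by date."""
    date = luigi.DateParameter()

    def output(self):
        # The parameter forms part of the output path, so each date is a
        # distinct task that Luigi tracks (and skips) independently.
        return luigi.LocalTarget(f"output/extract_{self.date:%Y%m%d}.csv")

    def run(self):
        with self.output().open("w") as f:
            f.write(f"data for {self.date}\n")


if __name__ == "__main__":
    # Backfill from the start of last year to today without touching the
    # task code or any config files.
    start = datetime.date(datetime.date.today().year - 1, 1, 1)
    days = (datetime.date.today() - start).days
    luigi.build(
        [ExtractDailyData(date=start + datetime.timedelta(days=i)) for i in range(days + 1)],
        local_scheduler=True,
    )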
Efficient to set up, host, and use alongside existing infrastructure:
While tools such as Apache Airflow may require a Kubernetes cluster (and more) to get running, Luigi, by contrast, is far simpler to host. You can run it on a basic VM (virtual machine) or on a cloud platform such as Google Cloud Platform, using a Cloud Run job. This makes it a great choice for smaller data pipelines or client-specific pipelines where you may want to decouple from the main infrastructure.
Market maturity and active use and development by many large brands:
Luigi has been used by a host of major brands over the years – including Squarespace, Skyscanner, Glossier, SeatGeek, Stripe, Hotels.com, and more – which is integral to its maintenance and viability as a good open-source tool. Its core functionality rarely changes, making it a stable and reliable choice; the updates we’ve experienced are primarily focused on maintaining security rather than big overhauls of its functionality, which brings us to a few of its shortfalls…
Limited frontend and UI:
Luigi’s frontend leaves a lot to be desired. Firstly, it only really shows you jobs that are running or have recently succeeded, so if you have many jobs running in one day, the History tab fails to give you a useful overview.
When something fails, you’ll be notified and you can inspect logs in a location you specify in advance; however, it would be nice if the frontend provided a good summary of this information instead.
Workarounds do exist, such as saving your task history (e.g., tasks that ran, the status, how long they took, etc) in a separate table (for example, Postgres) where it can be visualised in an external run dashboard – providing a more personalised frontend for better monitoring, visibility into run times, failure rates, and so on.
Setting something like this up would provide more feature parity with a tool such as Jenkins, which, by contrast, does a great job at providing stats and visual indicators for task history, job health, what’s running, and more – right out of the box.
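As a sketch of that workaround, Luigi’s built-in event handlers can write a row per task run into a history table. We’ve used SQLite here purely to keep the example self-contained – the table name and columns are illustrative, and in practice this would point at something like the Postgres table mentioned above:

import datetime
import sqlite3

import luigi

# Illustrative local store standing in for a Postgres run-history table.
conn = sqlite3.connect("task_history.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS task_history (task_id TEXT, status TEXT, recorded_at TEXT)"
)


def record(task, status):
    conn.execute(
        "INSERT INTO task_history VALUES (?, ?, ?)",
        (task.task_id, status, datetime.datetime.utcnow().isoformat()),
    )
    conn.commit()


@luigi.Task.event_handler(luigi.Event.SUCCESS)
def on_success(task):
    record(task, "SUCCESS")


@luigi.Task.event_handler(luigi.Event.FAILURE)
def on_failure(task, exception):
    record(task, f"FAILURE: {exception}")

A dashboard pointed at this table can then show statuses and failure rates over time.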
Documentation could be improved:
While Luigi provides all the key documentation you need, it’s not always the easiest to find or navigate – compared to tools such as dbt, the documentation feels sparse in places, especially when dealing with more advanced features or plugins.
For instance, helpful features such as dependency diagrams or task history tracking require installing separate modules – a process that isn’t particularly well explained in the official documentation.
In many instances, users may find themselves gaining the most clarity about how the tool works by trying things out and learning as they go.
Python path issues – everything must be explicit or Luigi will struggle to find it:
To avoid a barrage of ‘module not found’ errors, Luigi will need to know exactly where everything lives in your environment.
A workaround we found useful is creating a shell script that sets all the necessary paths and everything else Luigi may need to run successfully.
While something like this may take a little time to set up, it’s a small amount of upfront effort to improve your workflow in Luigi and avoid issues in the long run.
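We describe doing this with a shell script; purely for illustration, the same idea in Python might look like the sketch below, where the project path, module, and task names are all hypothetical:

import os
import subprocess
import sys

# Hypothetical project layout.
PROJECT_ROOT = "/opt/pipelines/my_project"

# Make sure every module Luigi needs is importable before it starts.
env = os.environ.copy()
env["PYTHONPATH"] = os.pathsep.join(
    p for p in [PROJECT_ROOT, os.path.join(PROJECT_ROOT, "tasks"), env.get("PYTHONPATH", "")] if p
)

# Launch a task with the local scheduler (module and task names are examples).
subprocess.run(
    [sys.executable, "-m", "luigi", "--module", "tasks.weekly_report",
     "WeeklyReport", "--local-scheduler"],
    env=env,
    check=True,
)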
We think Luigi is a powerful data orchestration tool for anyone who is comfortable with Python, has experience managing data pipelines, and is happy to get to grips with a few quirks that can make onboarding a bit challenging.
If you’re looking for an alternative to tools like Apache Airflow or Jenkins, Luigi is definitely worth trying out. While we recognise that its UI and documentation are lacking compared to other tools in this space, we found that Luigi’s version control, dependency handling, and logic capabilities make it a handy tool for a range of our clients’ use cases.
The post Data pipelines using Luigi – Strengths, weaknesses, and some top tips for getting started appeared first on Lynchpin.
In this blog, we discuss the development of an automated testing project, using the AI and automation capabilities of Cursor to scale and enhance the robustness of our data testing services. We walk through project aims, key benefits, and considerations when leveraging automation for analytics testing.
We manage a JavaScript library that is deployed to numerous sites, and we upgrade it on an ongoing basis to include improvements and enhancements. The library integrates with different third-party web analytics tools and performs a number of data cleaning and manipulation actions. Once we upgrade the library, our main priorities are:
Feature testing: Verify new functionality across different sites/environments
Regression testing: Ensure existing functionality has not been negatively affected across different sites
To achieve this, we conduct a detailed testing review across different pages of the site. This involves performing specific user actions (such as page views, clicks, search, and other more exciting actions) and ensuring that the different events are triggered as expected. We capture outgoing network requests to vendors such as Adobe Analytics or Google Analytics through the browser’s developer tools or a network debugging tool (e.g., Charles) and verify that the correct events are triggered and the relevant parameters are captured accurately in the network requests. By ensuring that all events are tracked with the right data points, we can confirm that both new features and the existing setup are working as expected.
To optimise this process and reduce the manual effort involved, we developed an automated testing tool designed to streamline and speed up data testing. As an overview, this tool automatically simulates user actions on different sites and different browsers, triggering the associated events, and then checks network requests to ensure that the expected events are fired, and the correct parameters are captured.
In the era of AI, automation is a key driver of efficiency and increased productivity. Automating testing processes offers several key benefits to our development and data testing capabilities, such as:
We chose Python as the primary scripting language, as it offers flexibility for handling complex tasks. Python’s versatility and extensive libraries made it an ideal choice for rapid development and iteration.
For simulating a variety of user interactions and conducting tests across multiple browsers, we selected Playwright. Playwright is a powerful open-source automation tool/API for browser automation. It supports cross-browser data testing (including Chrome, Safari, Firefox), allowing us to validate network requests across a broad range of environments.
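As a simplified sketch of the approach (the site URL and expected event name are placeholders, and a real test suite checks many more parameters), Playwright’s request listener makes it straightforward to capture outgoing GA4 hits while an action is simulated:

from playwright.sync_api import sync_playwright

# Illustrative site URL and expected GA4 event name.
SITE_URL = "https://www.example.com"
EXPECTED_EVENT = "page_view"


def run_check():
    captured = []

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        # Record every outgoing GA4 hit so we can inspect its parameters.
        page.on(
            "request",
            lambda request: captured.append(request.url)
            if "google-analytics.com/g/collect" in request.url
            else None,
        )

        page.goto(SITE_URL)
        page.wait_for_load_state("networkidle")
        browser.close()

    # Very simple assertion: at least one hit carried the expected event name.
    assert any(f"en={EXPECTED_EVENT}" in url for url in captured), (
        f"No GA4 request with en={EXPECTED_EVENT} was captured"
    )


if __name__ == "__main__":
    run_check()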
We used the Cursor AI code editor to optimise the development process and quickly set up the tool. Cursor’s proprietary LLM, optimised for coding, enabled us to design and create scripts efficiently, accelerating development by streamlining debugging and iteration. Cursor’s AI assistant (chat sidebar) boosted productivity by providing intelligent code suggestions and speeding up investigation. We’ll dive into our experience using Cursor a bit further in the next section.
Lastly, we chose Flask to build the web interface where users can select different types of automated testing. Flask is a lightweight web framework for Python, which we’ve had experience with on other projects. It has its pros and cons, but a key benefit for this project was that it allowed us to get started quickly and focus more on the nuts and bolts of the program.
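For context, a stripped-down version of that interface could be as small as the sketch below – the route, suite names, and form are placeholders rather than the tool’s actual implementation:

from flask import Flask, render_template_string, request

app = Flask(__name__)

# Hypothetical test suites a user could pick from the web interface.
TEST_SUITES = ["feature", "regression"]

FORM = """
<form method="post">
  <select name="suite">
    {% for suite in suites %}<option>{{ suite }}</option>{% endfor %}
  </select>
  <button type="submit">Run tests</button>
</form>
"""


@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        suite = request.form["suite"]
        # In the real tool this would kick off the Playwright checks;
        # here we just acknowledge the selection.
        return f"Started {suite} test run"
    return render_template_string(FORM, suites=TEST_SUITES)


if __name__ == "__main__":
    app.run(debug=True)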
Cursor AI played a crucial role in taking this project from ideation to MVP. By carefully prompting Cursor’s in-editor AI assistant, we were able to achieve the results we wanted. The tool allowed us to focus on the core structure of the program and the logic of each test without getting bogged down in documentation and finicky syntax errors.
Cursor also gave us the capability to include specific files, documentation links, and diagrams as context for prompts. This allowed us to provide relevant information for the model to find a solution. Compared to an earlier version of GitHub Copilot that we tested, we thought this was a clear benefit in leading the model to the most appropriate outcome.
Another useful benefit of Cursor AI was the automated code completion, which could identify bugs and propose fixes, as well as suggest code to add to the program. This feature was useful when it understood the outcome we were aiming for, which it did more often than not.
However, not everything was plain sailing, and our experience did reveal some drawbacks to using AI code editors to be mindful of. For example, relying too much on automated suggestions can distance you from the underlying code, making it harder to debug complex issues independently. It was important to review the suggested code and use Cursor’s helpful in-editor diffs to clearly see the proposed changes. This also allowed us to accept or reject those changes, giving us a good level of control.
Another drawback we noticed is that AI-generated code may not always follow best practices or be optimised for performance, so it’s crucial to review and validate the output carefully. For example, Cursor tended to create monolithic scripts instead of separating functionality into components, such as tests and Flask-related parts, which would be easier to manage in the long term.
Another point we noticed was that over-reliance on AI tools could easily lead to complacency, potentially affecting our problem-solving skills and creativity as developers. When asking Cursor to make large changes to the codebase, it can be easy to simply accept all changes and test whether they work without fully understanding the impact. When developing without AI assistance (like everyone did a couple of years ago), it’s better to make specific, relatively small changes at a time to reduce the risk of introducing breaking changes and to better understand the impact of each change. The same seems a sensible approach when working with a tool like Cursor.
The automated testing tool we developed significantly streamlined and optimised the data testing process in a number of key ways:
With AI, the classic engineering view of ‘why spend 1 hour doing something when I can spend 10 hours automating it?’ has now become ‘why spend 1 hour doing something when I can spend 2-3 hours automating it?’. In this instance, Cursor allowed us to lower the barrier for innovation and create a tool to meet a set of tight deadlines, whilst also giving us a feature-filled, reusable program moving forwards.
The post Automated testing: Developing a data testing tool using Cursor AI appeared first on Lynchpin.
In the latest episode of our ‘Calling Kevin’ video series, we show you how to clean up and filter URLs using a few simple expressions in Looker Studio.
By applying these Regular Expressions (RegEx), you can easily remove duplicates, fix casing issues, and tidy up troublesome URL data to standardise GA4 reporting – just as you would have been able to in Universal Analytics.
Expressions used:
For more quick GA4 tips, be sure to check out other videos from our ‘Calling Kevin’ series.
The post (Video) Applying RegEx filtering in Looker Studio to clean up and standardise GA4 reporting appeared first on Lynchpin.
How do you know what’s working and what isn’t – and how do you plan for success – as the tides of digital measurement continue to change?
The themes of privacy, measurement and marketing effectiveness triangulate around a natural trade-off and tension: balancing the anonymity of our behaviours and preferences against the ability of brands to reach us relevantly and efficiently.
In this briefing our CEO, Andrew Hood, gives you a practical and independent view of current industry trends and how to successfully navigate them.
Building on the themes introduced in the webinar, our white paper lays out an in-depth look at the privacy trends, advanced measurement strategies, and balanced approach you can take to optimise marketing effectiveness.
Unlock deep-dive insight and practical tips you can begin implementing today to guide your focus over the coming months.
To access a copy of the slides featured in the webinar, click the button below.
The post Webinar: Navigating Recent Trends in Privacy, Measurement & Marketing Effectiveness appeared first on Lynchpin.
The concept of marketing mix modelling (often referred to as just ‘MMM’) has been around for a while – as early as the 1960s in fact – which should be no surprise, as the business challenge of what marketing channels to use and where best to spend your money has always been the essence of good marketing, at least if somebody is holding you accountable for that spend and performance!
Marketing mix modelling has its foundations in statistical techniques and econometric modelling, which still holds largely true today. However, the mix of channels and advancements in end-to-end analytics create new challenges to be tackled, not least the expectations of what MMM is and what it can deliver.
In reality, there are various analytics techniques that can be undertaken to answer the overall business question: ‘how do my channels actually impact sales?’. In this blog we will answer some common questions about MMM, address some common (comparable) techniques, and share how and when you might look to choose one method over another.
MMM is a statistical technique, with its roots in regression, that aims to analyse the impact of various marketing tactics on sales over time (other KPIs are also available!). Marketing mix modelling will consider all aspects of marketing to do this, such as foundational frameworks like ‘The 4 Ps of Marketing’ (Product, Price, Place and Promotion).
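To make the regression intuition concrete, here is a toy example in Python using entirely made-up weekly data – a real MMM would also model adstock (carry-over) and diminishing returns rather than fitting a plain linear relationship:

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy weekly data: spend per channel and sales (entirely made up).
rng = np.random.default_rng(42)
weeks = 104
df = pd.DataFrame({
    "tv_spend": rng.uniform(0, 100, weeks),
    "search_spend": rng.uniform(0, 50, weeks),
    "display_spend": rng.uniform(0, 30, weeks),
})
df["sales"] = (
    500
    + 3.0 * df["tv_spend"]
    + 5.0 * df["search_spend"]
    + 1.5 * df["display_spend"]
    + rng.normal(0, 50, weeks)
)

# Regress sales on channel spend; the fitted coefficients give a rough
# estimate of the incremental sales per unit of spend on each channel.
X = sm.add_constant(df[["tv_spend", "search_spend", "display_spend"]])
model = sm.OLS(df["sales"], X).fit()
print(model.summary())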
MMM is similar to econometric modelling in terms of techniques used, however there are some key differences. On the whole, econometrics is broader in its considerations and applications, often encompassing aspects of general economic factors in relation to politics, international trade, public policy and more. MMM, on the other hand, focusses more specifically on marketing activities and their impact on business outcomes.
You might also come across the term ‘media mix modelling’ (with the same unhelpful acronym, ‘MMM’). Much like econometrics, media mix modelling tends to differ from marketing mix modelling in its scope and general objective. Media mix modelling has an even narrower focus than marketing mix modelling; as the name implies, it’s aimed more specifically at optimising a mix of media channels, with a focus on optimising advertising spend.
Whether it’s marketing mix modelling or media mix modelling you are looking at, the key is to consider the business question you are looking to answer and ensure your model is trained using the best input variables to answer that question – nothing new in the world of a good analytics project!
In recent years, the general trend has been to measure everything, integrate everything, and link all of your data together, leaving no doubt about who did what, when, and to what end. However, increasing concerns (or at least considerations) around data privacy and ethics have caused marketers to take a second look at how they collect and utilise their data.
There is a growing need to adapt to new privacy regulations, but also a greater desire to respect an individual’s privacy and find better ways to understand what marketing activities drive positive or negative outcomes.
With limitations on the ability to track third-party cookies, approaches such as marketing attribution may become more difficult to implement, although the effectiveness of these data sources is in itself doubtful. And with consent management becoming increasingly granular, even first-party measurement can leave gaps in your data collection.
However, the power that marketing attribution gave marketers is well recognised now, and the desire to continue to be data-led is only increasing. Machine learning has become a commonplace tool for filling the gaps that are creeping back into the tracking of user behaviour. Organisations are also increasingly eager to build on what they have learnt from these joined-up customer journeys, and there is that need again to look across the whole of marketing, not just digital touchpoints, and replicate that approach in a more holistic way.
So in summary, while marketing mix modelling has never gone away, it is now seeing a revival as an essential tool in a marketer’s toolbelt.
MMM is a great tool for any organisation looking to be more data-led in their approach to planning and analysing marketing activities. Key benefits of MMM include:
Ability to measure and optimise the effectiveness of marketing and advertising campaigns:
The purpose of MMM is to measure the impact of your marketing activities on your business outcomes. A well-built marketing mix model will enable you to quantify ROI by channel and make better data-led decisions on the mix of marketing activities that will lead to more optimised campaigns.
Natural adeptness at cross-channel insights:
With increasing limitations on tracking users across multiple channels, MMM neatly sidesteps these restrictions by using data at an aggregated level. By its very nature it doesn’t require linking user identities across different devices or tracking individuals through offline channels.
Enables more strategic planning and budgeting:
MMM provides data-driven insight to inform budget planning processes. Its outputs are transparent, allowing organisations to understand the impact each of their channels have on business outcomes and how those channels influence each other within the mix. By incorporating MMM with other tools for scenario planning, spend optimisation and forecasting, organisations can better understand what happened in the past to plan more effectively for the future.
Can be used when granular level data is not available:
As mentioned earlier, MMM works with data at an aggregated level. This offers more flexibility when looking to integrate data inputs into your decision making such as:
Has a longer-term focus:
MMM is a powerful technique for longer-term planning and for assessing the impact of campaigns that don’t necessarily provide immediate impact (e.g. brand awareness campaigns, TV, display advertising, etc.). By incorporating MMM into a measurement strategy, businesses can ensure longer-term activity is appropriately considered.
Earlier in this blog we looked at how marketing mix modelling compares to econometrics and media mix modelling. Another very important modelling approach to consider when looking at marketing effectiveness is marketing attribution.
Marketing attribution differs from marketing mix modelling in a number of important ways – most importantly by relying on a more granular approach. It looks to assign weightings to each individual touchpoint on the customer journey, incorporating each user’s journey and determining whether that journey leads to a successful conversion or not.
This very detailed understanding of how each customer interacts with your channels can be very powerful, but also very complex and time-consuming to both collect and analyse; in addition, with the increasing limitations on tracking individuals without their consent, you may end up having to rely on only a partial picture of the user journey.
While of course it is possible to model on a subset of data, you would need to be careful that the user journey you are looking to understand is not unfairly weighted to those channels (or individuals) that are easier to track.
Marketing attribution also uses a wider range of modelling algorithms, from the simple (linear, time-decay) to the more complex (Markov Chains, Game Theory, ML models). This range of models to select from can be both a benefit and a hindrance, with difficulties arising when you’re not sure what marketing attribution model will suit your business needs best.
Marketing mix modelling does have its own drawbacks to consider too. The biggest consideration when determining if MMM is suitable for you is to understand how much historical data you have.
While a marketing attribution model can work on just a few months of data (so long as it has decent volume and is fairly representative of your typical user journeys), MMM relies on trends over longer periods of time – typically a minimum of 2 years’ worth of data is advised before undertaking an MMM project. MMM also works best when looking at the broader impact marketing has on your goals. Therefore, if you need to analyse specific campaign performance or delve deeper into specific channels, then marketing attribution will be the better bet.
In a previous blog, we discussed the merits of using both marketing attribution and MMM side by side to provide a more powerful and comprehensive understanding of marketing effectiveness.
While a marketing attribution model will focus on individual touchpoints and their contributions, MMM will take a holistic view, considering the overall impact of marketing inputs. By combining these two approaches, marketers can gain a more complete picture of how different marketing elements work together to drive business outcomes, and demystify the balance needed across marketing activity for maximum business performance.
Marketing mix modelling is a very powerful and well-established statistical technique. Most marketers should be at least exploring the benefits and insight it provides into the relationship between marketing activity and business performance to optimise planning and decision making.
One barrier to entry in starting an MMM project can be navigating what may appear to be a complex set of approaches and techniques. While variations of MMM do exist – econometrics, marketing mix modelling, and media mix modelling – the key difference lies in the scope and objective of the business question you aim to answer. Successfully choosing and developing a model depends on fully understanding your business needs and the data available to you. Investing time upfront to determine what you are looking to achieve is essential to getting the right outcomes.
MMM is best used for strategic planning and determining longer term impacts of your marketing activities. Therefore, if you require more in-depth campaign and channel analysis, then marketing attribution may be more suitable for your business needs. However, it’s important to note that MMM and marketing attribution can work side by side to develop a more complete picture of your marketing activities. While MMM allows greater flexibility when working with a mix of channels that are both tracked and not tracked, the ability of marketing attribution to provide a more granular analysis of your marketing journeys, channels, and campaigns allows for day-to-day optimisation of your marketing activities alongside the longer-term strategy set out by your MMM insights.
If you are ready to explore MMM, marketing attribution, or anything in between, we’d be delighted to discuss your needs in more detail.
The post Benefits of marketing mix modelling: Why is MMM so popular right now? appeared first on Lynchpin.
Google has recently updated their GA4 ‘Data Import’ feature to finally support Custom Event metadata. This is a significant development, but before we dive in, let’s remind ourselves of a key point: despite its name, which can give false hope after an outage, ‘Data Import’ is NOT a solution for repopulating lost data. It is, however, a powerful tool for augmenting existing data with information that isn’t directly collected in GA4. Common sources that we find our clients wanting to integrate include CRM systems, offline sales data, or other third-party analytics tools.
When would Custom Event Data Import be useful?
Well, there are many cases:
The information we import might not be available until after collection. This could include data that is processed or generated by third-party tools after the event has already occurred. A prime example would be cost data for non-Google ad clicks and impressions.
Some information might not be something we want exposed on our site. Importing such data ensures it remains secure and is only used for internal analysis. These might include things like a product’s wholesale price, or a user’s lifetime customer value.
Information collected offline, such as in-store purchases or interactions, could be integrated with your existing GA4 data to allow for a more complete view of customer behaviour across both online and offline touchpoints.
Although Data Import already supported Cost, Product, and User-scoped data, what was conspicuously absent until now was the ability to import data directly scoped to existing Custom Events. This is particularly significant because, as Google likes to remind us, GA4 is ultimately event-based.
To understand if this development could be useful for you, consider the events you already track. Is there any information directly related to these events and their custom dimensions that you don’t collect in GA4, but have available offline or in another tool? If so, Custom Event data import could be very handy.
It’s been a long and somewhat painful journey with GA4, but it’s great to see it gradually becoming feature complete.
Of course, if you’re looking to augment your GA4 data with information available at the point of collection, Lynchpin would recommend harnessing the power of a server-side GTM implementation to enrich the data before it even arrives in GA4 itself.
For more information on server-side GTM and its advantages we highly recommend reading the blogs below:
To discuss any of the topics mentioned in this blog or to find out how Lynchpin can support you with any other data and analytics query, please do not hesitate to reach out to a member of our team.
The post Google (finally) supports Custom Event Data Import in GA4 appeared first on Lynchpin.
Here at Lynchpin, we’ve found dbt to be an excellent tool for the transformation layer of our data pipelines. We’ve used both dbt Cloud and dbt Core, mostly on the BigQuery and Postgres adapters.
We’ve found developing dbt data pipelines to be a really clean experience, allowing you to get rid of a lot of boilerplate or repetitive code (which is so often the case when writing SQL pipelines!).
It also comes with some really nice bonuses like automatic documentation and testing along with fantastic integrations with tooling like SQLFluff and the VSCode dbt Power User extension.
As with everything, as we’ve used the tool more we have found a few counter-intuitive quirks that left us scratching our heads a little bit, so we thought we’d share our experiences!
All of these quirks have workarounds, so we’ll share our thoughts plus the workarounds that we use.
Summary:
Incremental loads in dbt are a really useful feature that allows you to cut down on the amount of source data a model needs to process. At the cost of some extra complexity, they can vastly reduce query size and the cost of the pipeline run.
For those who haven’t used it, this is controlled through the is_incremental() macro, meaning you can write super-efficient models like this:
SELECT *
FROM my_date_partitioned_table
{% if is_incremental() %}
WHERE date_column > (SELECT MAX(date_column) FROM {{ this }})
{% endif %}
This statement is looking at the underlying model and finding the most recent data based on date_column. It then only queries the source data for data after this. If the table my_date_partitioned_table is partitioned on date_column, then this can have massive savings on query costs.
Here at Lynchpin, we’re often working with the GA4 → BigQuery data export. This free feature loads a new BigQuery table events_yyyymmdd every day. You can query all the daily export tables with a wildcard * and also filter on the tables in the query using the pseudo-column _TABLE_SUFFIX:
SELECT *
FROM `lynchpin-marketing.analytics_262556649.events_*`
WHERE _TABLE_SUFFIX = '20240416';
The problem is incremental loads just don’t work very nicely with these wildcard tables – at least not in the same way as a partitioned table in the earlier example.
-- This performs a full scan of every table - rendering
-- incremental load logic completely useless!
SELECT
  *,
  _TABLE_SUFFIX AS source_table_suffix
FROM `lynchpin-marketing.analytics_262556649.events_*`
{% if is_incremental() %}
WHERE _TABLE_SUFFIX > (SELECT MAX(source_table_suffix) FROM {{ this }})
{% endif %}
This is pretty disastrous because scanning every daily table in a GA4 export can be an expensive query, and running this every time you load the model doesn’t do your cloud budget any favours.
The reason this happens is down to a quirk in the query optimiser in BigQuery – we have a full explanation and solution to it at the end of this blog if you want to fix this yourself.
The sql_header() macro is used to run SQL statements before the code block of your model runs, and we’ve actually found it to be necessary in the majority of our models. For instance, you need it for user defined functions, declaring and setting script variables, and for the solution to quirk #1.
The problem is that the sql_header() macro isn’t really fit for purpose and you run into a few issues:
dbt supports different environments, which can be easily switched at runtime using the --target command line flag. This is great for keeping a clean development environment separate from production.
One thing we did find a little annoying was configuring different data sources for your development and production runs, as you probably don’t want to have to run on all your prod data every time you run your pipeline in dev. Even if you have incremental loads set up, a change to a table schema soon means you need to run a full refresh which can get expensive if running on production data.
One solution is reducing the amount of data using a conditional like so:
{% if target.name == 'dev' %}
  AND date_column BETWEEN TIMESTAMP('{{ var("dev_data_start_date") }}')
                      AND TIMESTAMP('{{ var("dev_data_end_date") }}')
{% endif %}
This brings in extra complexity to your codebase and is annoying to do for every single one of your models that query a source.
The best solution we saw to this was here: https://discourse.getdbt.com/t/how-do-i-specify-a-different-schema-for-my-source-at-run-time/561/3
The solution is to create a dev version of each source in the YAML file, named {source name}_dev (e.g. my_source_dev for the dev version of my_source), and then have a macro that switches which source is used based on the target value at runtime.
Another example in this vein: getting dbt to enforce foreign key constraints requires this slightly ugly expression to switch between schemas in the schema.yaml file:
- type: foreign_key
  columns: ["blog_id"]
  expression: "`lynchpin-marketing.{{ 'ga4_reporting_pipeline' if target.name != 'dev' else 'ga4_reporting_pipeline_dev' }}.blogs` (blogs)"
Let’s revisit:
SELECT *
FROM `lynchpin-marketing.analytics_262556649.events_*`
WHERE _TABLE_SUFFIX = '20240416';
This is fine – the table scan performed here only scans tables with a suffix equal to 20240416 (i.e. one table), and bytes billed is 225 KB.
OK, so how about only wanting to query from the latest table?
If we firstly wanted to find out the latest table in the export:
-- At time of query, returns '20240416'
SELECT MAX(_TABLE_SUFFIX)
FROM `lynchpin-marketing.analytics_262556649.events_*`
This query actually has no cost!
Great, so we’ll just put that together in one query:
SELECT *
FROM `lynchpin-marketing.analytics_262556649.events_*`
WHERE _TABLE_SUFFIX = (
  SELECT MAX(_TABLE_SUFFIX)
  FROM `lynchpin-marketing.analytics_262556649.events_*`
)
Hang on… what!?
BigQuery’s query optimiser isn’t smart enough to evaluate the inner query first and use that value to reduce the scope of tables scanned in the outer query – so the outer query still scans every daily table.
Here’s our solution, which involves a slightly hacky way to ensure the header works in both incremental and non-incremental loads. We implemented this in a macro to make it reusable.
{% call set_sql_header(config) %}
  DECLARE table_size INT64;
  DECLARE max_table_suffix STRING;

  SET table_size = (
    SELECT size_bytes
    FROM {{ this.dataset }}.__TABLES__
    WHERE table_id = '{{ this.table }}'
  );

  IF table_size > 0 THEN
    SET max_table_suffix = (SELECT MAX(max_table_suffix) FROM {{ this }});
  ELSE
    SET max_table_suffix = '{{ var("start_date") }}';
  END IF;
{% endcall %}

-- Allows for using max_table_suffix to filter source data.
-- Example usage:
SELECT *
FROM {{ source('ga4_export', 'events') }}
{% if is_incremental() %}
WHERE _table_suffix > max_table_suffix
{% endif %}
We hope you found this blog useful. If you happen to use any of our solutions or come across any strange quirks yourself, we’d be keen to hear more!
To find out how Lynchpin can support you with data transformation, data pipelines, or any other measurement challenges, please visit our links below or reach out to a member of our team.
The post Working with dbt & BigQuery: Some issues we encountered and their solutions appeared first on Lynchpin.
In the latest episode of our ‘Calling Kevin’ video series, our Senior Data Consultant, Kevin, tackles a common issue many users face in Google Analytics: an increase in the number of ‘(not set)’ landing page values in GA4.
In the video below, Kevin covers:
For more quick GA4 tips, be sure to check out other videos from our ‘Calling Kevin’ series.
The post (Video) ‘(not set)’ landing page values in GA4: Explained appeared first on Lynchpin.
In the latest from our ‘Calling Kevin’ video series, our Senior Data Consultant, Kevin, covers a few common questions about user metrics in GA4.
In this quick walkthrough, Kevin runs through the definitions and differences between the user metrics available in GA4 – some of which may be familiar to those experienced with Universal Analytics, while other changes are exclusive to GA4, causing some confusion for Google Analytics users both new and old.
For more quick GA4 tips, be sure to check out other videos from our ‘Calling Kevin’ series.
The post (Video) Breaking down user metrics in GA4 appeared first on Lynchpin.