How to Use Proxy with cURL and wget? The Ultimate Guide

Are you looking to take your web scraping and internet privacy game to the next level? Mastering proxies is a must!

As a seasoned data analyst and privacy advocate, I’ve used my fair share of proxies with command-line tools like cURL and wget. In this comprehensive guide, I’ll be sharing everything I’ve learned.

By the end, you’ll be a pro at configuring proxies to access more of the web anonymously!

Why Should You Care About Proxies?

Before we dive into the technical details, it helps to understand why proxies even matter in the first place.

Proxies are essential for certain internet activities like web scraping. Many websites actively block scraping bots and crawlers. Proxies enable you to bypass these blocks by hiding your real IP address and "rotating" to new IPs.

According to a 2022 survey by ParseHub, over 71% of companies use web scraping as part of their data collection process. Proxies help facilitate this at scale.

Proxies also enhance your privacy and security. Your IP address can reveal a lot about you like your location and ISP. Proxies anonymize this. I never scrape or automate tasks online without going through a proxy or VPN first.

Research by ExpressVPN found over 30% of businesses already use proxies specifically for privacy reasons.

And proxies allow you to access geo-restricted content by spoofing your location – for instance, viewing the US Netflix catalog from Europe.

So in summary, here are some of the top use cases:

  • Web scraping at scale
  • Internet privacy and security
  • Accessing blocked or geo-restricted content
  • General anonymous web browsing

Now that you know why proxies matter, let’s get into the details of deploying them with cURL and wget…

A Quick Intro to Proxies

Before we get our hands dirty with configurations, let’s make sure we understand what exactly proxy servers are and how they work.

A proxy server acts as an intermediary between your device and the rest of the internet. Your traffic gets routed through the proxy, which forwards requests on your behalf.

Here’s a quick diagram:

[Diagram: How Proxies Work – your device → proxy server → target website]

Instead of connecting directly with a website, you first connect to the proxy server which then communicates with the target site.

This provides a few key benefits:

  • Masks your real IP address – The site only sees the proxy’s IP, not your local one.

  • Can add a layer of security – With an HTTPS or SOCKS5 proxy, traffic between your device and the proxy server can be encrypted (a plain HTTP proxy does not encrypt anything).

  • Allows access to blocked resources – Proxies open up sites and apps that may be restricted based on geography or firewall policies.

Some key metrics on the proxy market:

  • Over 50% of businesses rely on proxies for media monitoring, brand protection and competitive intelligence according to Moz.

  • The commercial proxy market is expected to grow by $1.6 billion between 2022-2026 according to Technavio.

  • Luminati’s (now Bright Data) residential proxy network covers over 1 million IPs across hundreds of cities globally.

Now let’s explore how to leverage proxies with two common command-line tools: cURL and wget.

Configuring Proxies with cURL

cURL is one of my favorite command-line utilities that I use almost daily. It’s installed on pretty much every Linux and macOS machine, but also available for Windows.

cURL lets you transfer data using various protocols such as HTTP, HTTPS and FTP. This makes it useful for tasks like:

  • Calling APIs
  • Downloading files
  • Web scraping
  • Automation

Many of these use cases will benefit from routing your requests through proxy servers.

Here’s how to set up a proxy with cURL:

Install cURL

If you don’t already have curl installed, grab it from:

  • Windows: recent Windows 10 and 11 releases ship with curl; otherwise download it from curl.se
  • macOS: brew install curl
  • Linux: Use your package manager, e.g.:
    • Debian/Ubuntu: sudo apt install curl
    • CentOS/Fedora: sudo dnf install curl (or yum on older releases)

Verify it’s installed properly with curl --version

Specify Proxy in cURL Command

The easiest way is to specify the proxy right in your cURL command using the -x flag like so:

curl -x <proxy-ip>:<port> <target-url>

For example:

curl -x 203.0.113.10:8080 https://example.com/

This routes the request for example.com through the proxy at 203.0.113.10 on port 8080 (both addresses here are placeholders).

You can also use a domain name for the proxy instead of IP.

Some proxies will require authentication, in which case you provide the username and password with -U (--proxy-user) in user:password format:

curl -x <proxy-ip>:<port> -U bob:1234 <target-url>
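You can also embed the credentials in the proxy URL itself – handy when one environment variable needs to carry everything. A minimal sketch, with hypothetical example values (bob:1234 at 203.0.113.10:8080):

```shell
# Hypothetical credentials and proxy address, for illustration only.
PROXY_USER="bob"
PROXY_PASS="1234"
PROXY_ADDR="203.0.113.10:8080"

# curl (and most tools that honor http_proxy) accept the
# user:password@host:port form in the proxy URL.
export http_proxy="http://${PROXY_USER}:${PROXY_PASS}@${PROXY_ADDR}"

echo "$http_proxy"   # → http://bob:1234@203.0.113.10:8080
```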

Set Environment Variables

Rather than specifying proxies on each request, you can set environment variables:

# Proxy for plain-HTTP URLs:
export http_proxy="http://<ip>:<port>"

# Proxy for HTTPS URLs (the value usually still starts with http://,
# since most proxies accept client connections over plain HTTP):
export https_proxy="http://<ip>:<port>"

Now all curl requests will automatically use the defined proxy server.

This saves you from repeatedly retyping the proxy details each time.
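You can also persist the setting in curl’s own config file: curl reads ~/.curlrc on every run, and -K lets you point it at any config file explicitly. A sketch, written to a scratch file here so your real ~/.curlrc is untouched (the proxy address is a placeholder):

```shell
# Write a curl config fragment to a temp file; in real use, put the
# same line in ~/.curlrc or pass the file with: curl -K <file> <url>
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
proxy = "http://203.0.113.10:3128"
EOF
cat "$CONF"
```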

Rotating Proxy IPs

An advantage of using a commercial proxy service is they provide access to thousands of proxy IPs worldwide.

You can automatically rotate through these different IPs to avoid getting blocked while web scraping.

Tools like proxychains make this easy to implement with cURL: list your proxies in the [ProxyList] section of /etc/proxychains.conf and enable random_chain so each run picks one at random.

For example, this command then routes through a randomly chosen proxy from your configured list:

proxychains curl https://example.com/

This technique is extremely useful when scraping or crawling larger sites.
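If you’d rather not depend on proxychains, the rotation idea is easy to sketch directly in the shell – pick a random member of a proxy pool before each request. The addresses below are hypothetical placeholders:

```shell
# Hypothetical proxy pool – replace with addresses from your provider.
PROXIES="203.0.113.10:8080 203.0.113.11:8080 203.0.113.12:8080"

# Pick one pool member at random for this request
# (shuf is part of GNU coreutils).
proxy=$(printf '%s\n' $PROXIES | shuf -n 1)
echo "using proxy: $proxy"

# The real request would then be:
#   curl -x "$proxy" https://example.com/
```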

Testing Proxy Connectivity

Once you’ve configured a proxy, verify that it’s working. Generic response headers won’t tell you which IP the target server saw, so the simplest check is to request a service that echoes your public IP back, such as ifconfig.me:

Without Proxy:

curl https://ifconfig.me

This returns your real public IP address.

Using Proxy:

curl -x <proxy-ip>:<port> https://ifconfig.me

If this returns the proxy’s IP instead of your own, the proxy is working correctly with cURL.

Leveraging Proxies with wget

Like cURL, wget is another handy command-line tool for transferring data over HTTP, HTTPS, and FTP.

wget comes pre-installed on most Linux and macOS systems. For Windows, grab the binary from GNU.

Here are some common use cases for wget:

  • Downloading files from web servers
  • Mirroring entire websites
  • Web scraping
  • Automation

And similar to cURL, using proxies with wget will allow you to scrape and download data more effectively.

Here’s how to configure wget to work with a proxy:

Install wget

If you don’t already have wget, install it from:

  • Linux: Use your package manager, e.g. sudo apt install wget

  • macOS: brew install wget

  • Windows: Download the binary from the GNU website

Set Environment Variables

Configure your proxy settings for wget by setting the following environment variables (note that the https_proxy value typically still uses the http:// scheme, since most proxies accept client connections over plain HTTP):

export http_proxy="http://<proxy-ip>:<port>"
export https_proxy="http://<proxy-ip>:<port>"

If the proxy requires authentication:

export http_proxy="http://<username>:<password>@<proxy-ip>:<port>"
export https_proxy="http://<username>:<password>@<proxy-ip>:<port>"

This will funnel all wget traffic through your defined proxy server.

To make these permanent, add the exports to your shell profile config (e.g. ~/.bashrc).
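wget also has its own config file – ~/.wgetrc, or whatever file the WGETRC environment variable points at – which keeps proxy settings out of your shell profile entirely. A sketch with a placeholder proxy address, using a scratch file so your real ~/.wgetrc is untouched:

```shell
# Point WGETRC at a temp file for this demo; in real use, write the
# same lines to ~/.wgetrc. The proxy address is a placeholder.
export WGETRC="$(mktemp)"
cat > "$WGETRC" <<'EOF'
use_proxy = on
http_proxy = http://203.0.113.10:3128/
https_proxy = http://203.0.113.10:3128/
EOF
cat "$WGETRC"
```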

Rotate Proxy IPs

Like with cURL, you can rotate between proxy IPs on each request to scrape without getting blocked:

proxychains wget https://example.com/

proxychains handles routing your requests via different proxy IPs randomly.

This is an extremely useful technique for large-scale web scraping or when hitting restrictive sites.

Verify Proxy Connection

To verify your proxy is configured correctly:

Without proxy:

HTTP request sent, awaiting response... 200 OK
Length: 2153 (2.1K) [text/html]
Saving to: ‘index.html’

The "HTTP request sent" line shows wget talking to the server directly.

Using proxy:

Connecting to <proxy-ip>:<port>... connected.
Proxy request sent, awaiting response... 200 OK
Length: 2153 (2.1K) [text/html]
Saving to: ‘index.html’

The "Connecting to <proxy-ip>" and "Proxy request sent" lines confirm requests are now routed through the proxy, so the configuration is working properly.

Choosing the Best Proxies

With so many proxy services out there, how do you choose one that will meet your needs?

Here are some things I keep in mind:

Proxy protocols – Make sure the provider offers the protocols you need (e.g. HTTP/S, SOCKS5).

Locations – If targeting country-specific sites, look for providers with proxies geo-distributed globally.

IP diversity – The more proxy IPs, the better, to avoid blocks when scraping.

Speed – Faster proxies result in better scraping/automation performance.

Reliability – Check reviews and uptime history to assess reliability.

Features – Auto IP rotation, residential IPs, and dedicated support are all useful.

Pricing model – Subscription plans, pay-as-you-go, free — choose what fits your budget.

Based on the above criteria, here are some of my favorite picks:

  • Luminati (now Bright Data) – Over 1M residential IPs worldwide
  • Oxylabs – Reliable infrastructure, great support
  • GeoSurf – Focus on residential proxy networks
  • IPRoyal – Affordable, user-friendly

Make sure to thoroughly test any paid or free proxy service before using it for mission-critical applications.

For casual personal use, free public proxies may be fine, but I’d recommend a paid provider for commercial web scraping and automation tasks.

Troubleshooting Common Proxy Issues

Proxies not working as expected? Don’t panic! Here are some common issues and how to resolve them:

Proxy Not Responding

  • Verify proxy server is up – ping it!
  • Double check IP/port is correct
  • Try a different proxy server
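The checks above can be scripted. Here’s a rough reachability test using bash’s built-in /dev/tcp (the address is a hypothetical placeholder, so expect it to come back unreachable):

```shell
# Try to open a TCP connection to the proxy, capped at 3 seconds.
PROXY_HOST=203.0.113.10   # placeholder (TEST-NET address)
PROXY_PORT=8080
if timeout 3 bash -c "exec 3<>/dev/tcp/$PROXY_HOST/$PROXY_PORT" 2>/dev/null; then
  status="reachable"
else
  status="unreachable"
fi
echo "proxy port is $status"
```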

Connection Refused

  • Confirm your IP is allowed by proxy access rules
  • Check for firewall or antivirus blocking access
  • Provide valid credentials if proxy requires auth

SSL/TLS Certificate Issues

  • Make sure proxy SSL certificate is valid
  • Obtain updated cert if expired or invalid
  • Disable SSL verification as a last resort (curl -k, wget --no-check-certificate) – less secure

Authentication Failure

  • Double check the username and password are 100% correct
  • Try different authentication method if required by proxy
  • Contact proxy provider for troubleshooting help

Getting Blocked Frequently

  • Rotate proxy IPs with each request
  • Use residential proxies which are less detectable
  • Slow down requests to appear more human
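That last point – slowing down – can be as simple as a randomized pause between requests. A sketch with placeholder URLs:

```shell
# Fetch a list of pages with a random 1–3 second pause between them,
# so the request pattern looks less like a bot.
count=0
for url in https://example.com/page1 https://example.com/page2; do
  echo "fetching $url"          # stand-in for: curl -x "$proxy" "$url"
  count=$((count + 1))
  sleep $(( (RANDOM % 3) + 1 ))
done
echo "fetched $count pages"
```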

Don’t hesitate to reach out to your proxy provider’s technical support if you run into any other issues.

Final Thoughts

If you made it this far – congratulations! You now know how to confidently configure proxies with popular command-line tools like cURL and wget.

Summing up the key benefits:

  • Mask your real IP address
  • Bypass geographic restrictions
  • Scrape sites undetected
  • Enhanced privacy and security

I suggest starting out with free public proxies to test the waters. Once you’re ready for advanced scraping or privacy, upgrade to a paid provider.

Be responsible in how you use proxies and respect website terms of service. They can be immensely powerful tools when used ethically!

The proxy-based sky is the limit for what you can achieve from the command line!
