feat: implement ticket based F3 participation lease (#12531)
* Implement ticket based F3 participation lease

Implemented enhanced ticket-based participation system for F3 consensus
in `F3Participate`. This update introduces a new design where
participation tickets grant a temporary lease, allowing storage
providers to sign as part of the F3 consensus mechanism. This design ensures that
tickets are checked for validity and issuer alignment, handling errors
robustly. If there's an issuer mismatch, the system advises miners to
retry with the existing ticket. If the ticket is invalid or expired,
miners are directed to obtain a new ticket via
`F3GetOrRenewParticipationTicket`.
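
The error handling described above can be sketched as a small decision function. This is a minimal illustration, not the code from this commit: the sentinel errors are hypothetical local stand-ins for the `ErrF3Participation*` values, and `action` and `nextAction` are names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical local stand-ins for the sentinel errors named in the commit
// message; the real values live in the lotus api package.
var (
	errIssuerMismatch = errors.New("issuer does not match current node")
	errTicketInvalid  = errors.New("ticket is not valid")
	errTicketExpired  = errors.New("ticket has expired")
)

// action is what the miner should do after a participation attempt.
type action int

const (
	proceed         action = iota // participation succeeded
	retrySameTicket               // issuer mismatch: retry with the existing ticket
	getNewTicket                  // invalid or expired: obtain a new ticket
	giveUp                        // any other error is surfaced to the caller
)

// nextAction maps the error from a participation attempt to the follow-up
// step described in the commit message.
func nextAction(err error) action {
	switch {
	case err == nil:
		return proceed
	case errors.Is(err, errIssuerMismatch):
		return retrySameTicket
	case errors.Is(err, errTicketInvalid), errors.Is(err, errTicketExpired):
		return getNewTicket
	default:
		return giveUp
	}
}

func main() {
	fmt.Println(nextAction(errIssuerMismatch) == retrySameTicket)
	fmt.Println(nextAction(fmt.Errorf("rpc: %w", errTicketExpired)) == getNewTicket)
}
```

Using `errors.Is` rather than `==` keeps the mapping working when a caller wraps the error with extra context.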

Fixes filecoin-project/go-f3#599

* Use fresh timer every time for F3 backoffs

To avoid a potential deadlock when f3Participator is used from
multiple goroutines, use throw-away timers at the price of more GC work.

Also use the cancel function in context explicitly in a unified stop
hook that awaits the participation to end before exiting.
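
A throw-away timer backoff of the kind described above might look like the following sketch. The names `backOff` and `runDemo` are hypothetical and do not come from the commit; the point is only that each wait allocates its own timer instead of resetting a shared one.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// backOff waits for d or until ctx is cancelled. It creates a fresh timer on
// every call, so concurrent callers never share (or race on) timer state.
func backOff(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop() // release the timer promptly if ctx wins the race
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-timer.C:
		return nil
	}
}

// runDemo exercises both outcomes: a wait that completes and a wait that is
// cut short by cancellation.
func runDemo() (completed, cancelled bool) {
	completed = backOff(context.Background(), time.Millisecond) == nil

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // cancel before waiting so the long timer never fires
	cancelled = backOff(ctx, time.Hour) != nil
	return completed, cancelled
}

func main() {
	completed, cancelled := runDemo()
	fmt.Println(completed, cancelled)
}
```

The cost is one short-lived allocation per wait, which is the "higher GC" trade-off the commit message mentions.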

* Strictly require start instance to never decrease

Require the start instance of a participation to never decrease if the
miner already holds a lease.
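
As a rough sketch of this rule, assuming a per-miner map of leases guarded by a mutex (the `lease`, `leaser`, and `participate` names are hypothetical, not taken from the commit):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errStartBeforeExisting is a local stand-in for
// ErrF3ParticipationTicketStartBeforeExisting.
var errStartBeforeExisting = errors.New("ticket starts before existing lease")

// lease holds only the fields needed for this illustration.
type lease struct {
	FromInstance uint64
	ValidityTerm uint64
}

// leaser tracks at most one lease per miner ID.
type leaser struct {
	mu     sync.Mutex
	leases map[uint64]lease
}

func newLeaser() *leaser { return &leaser{leases: make(map[uint64]lease)} }

// participate records next for minerID unless its start instance is lower
// than the start instance of the lease the miner already holds.
func (l *leaser) participate(minerID uint64, next lease) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if cur, ok := l.leases[minerID]; ok && next.FromInstance < cur.FromInstance {
		return errStartBeforeExisting
	}
	l.leases[minerID] = next
	return nil
}

func main() {
	l := newLeaser()
	fmt.Println(l.participate(1, lease{FromInstance: 10, ValidityTerm: 5}))
	fmt.Println(l.participate(1, lease{FromInstance: 9, ValidityTerm: 5}))
}
```

An equal start instance is accepted in this sketch because the rule only forbids a decrease.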

* feat(f3): update go-f3 to 0.7.0 and adapt for changes to the API

* feat(f3): Include the network name in the lease

That way we don't re-use leases across networks. It's a bit racy (we ask
for the manifest before we ask for the current progress) but it should
be fine because at least we won't create a lease for the new network
with a future instance.

There's still an ABA problem if we rapidly switch back and forth between
two networks but... let's just not do that? At least for the mainnet
switchover, that won't be an issue because we enforce a 900 epoch
silence period.

I have to say, I'm not happy about this. But... we can probably just
hard-code it in the future once we get rid of the dynamic manifest.

* Handle not ready error gracefully in participator

Back off and get a fresh token if F3 is not ready.

---------

Co-authored-by: Steven Allen <[email protected]>
masih and Stebalien authored Oct 8, 2024
1 parent 1b9b815 commit a0d5292
Showing 22 changed files with 1,774 additions and 850 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
* Added `StateMinerInitialPledgeForSector` RPC method and deprecated existing `StateMinerInitialPledgeCollateral` method. Since ProveCommitSectors3 and ProveReplicaUpdates3, sector onboarding no longer includes an explicit notion of "deals", and precommit messages no longer contain deal information. This makes the existing `StateMinerInitialPledgeCollateral` unable to properly calculate pledge requirements with only the precommit. `StateMinerInitialPledgeForSector` is a new simplified calculator that simply takes duration, sector size, and verified size and estimates pledge based on current network conditions. Please note that the `StateMinerInitialPledgeCollateral` method will be removed entirely in the next non-patch release. ([filecoin-project/lotus#12384](https://github.com/filecoin-project/lotus/pull/12384))
* Implement [FIP-0081](https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0081.md) and its migration for NV24. Initial pledge collateral will now be calculated using a 70% / 30% split between "simple" and "baseline" in the initial consensus pledge contribution to collateral calculation. The change in this calculation will begin at NV24 activation and ramp up from the current split of 100% / 0% to the eventual 70% / 30% over the course of a year so as to minimise impact on existing operations. ([filecoin-project/lotus#12526](https://github.com/filecoin-project/lotus/pull/12526))
* Update to F3 0.4.0 ([filecoin-project/lotus#12547](https://github.com/filecoin-project/lotus/pull/12547)). This includes additional performance enhancements and bug fixes.
* [Ticket-based F3 participation API](https://github.com/filecoin-project/lotus/pull/12531): This update introduces a new design where participation tickets grant a temporary lease, allowing storage providers to sign as part of a single GPBFT instance at any given point in time. This design ensures that tickets are checked for validity and issuer alignment, handling errors robustly in order to avoid self-equivocation during GPBFT instances.

## Improvements

94 changes: 82 additions & 12 deletions api/api_errors.go
@@ -10,22 +10,57 @@ import (
const (
EOutOfGas = iota + jsonrpc.FirstUserCode
EActorNotFound
EF3Disabled
EF3ParticipationTicketInvalid
EF3ParticipationTicketExpired
EF3ParticipationIssuerMismatch
EF3ParticipationTooManyInstances
EF3ParticipationTicketStartBeforeExisting
EF3NotReady
)

type ErrOutOfGas struct{}
var (
RPCErrors = jsonrpc.NewErrors()

func (e *ErrOutOfGas) Error() string {
return "call ran out of gas"
}
// ErrF3Disabled signals that F3 consensus process is disabled.
ErrF3Disabled = errF3Disabled{}
// ErrF3ParticipationTicketInvalid signals that F3ParticipationTicket cannot be decoded.
ErrF3ParticipationTicketInvalid = errF3ParticipationTicketInvalid{}
// ErrF3ParticipationTicketExpired signals that the current GPBFT instance has surpassed the expiry of the ticket.
ErrF3ParticipationTicketExpired = errF3ParticipationTicketExpired{}
// ErrF3ParticipationIssuerMismatch signals that the ticket is not issued by the current node.
ErrF3ParticipationIssuerMismatch = errF3ParticipationIssuerMismatch{}
// ErrF3ParticipationTooManyInstances signals that participation ticket cannot be
// issued because it asks for too many instances.
ErrF3ParticipationTooManyInstances = errF3ParticipationTooManyInstances{}
// ErrF3ParticipationTicketStartBeforeExisting signals that participation ticket
// is before the start instance of an existing lease held by the miner.
ErrF3ParticipationTicketStartBeforeExisting = errF3ParticipationTicketStartBeforeExisting{}
// ErrF3NotReady signals that the F3 instance isn't ready for participation yet. The caller
// should back off and try again later.
ErrF3NotReady = errF3NotReady{}

type ErrActorNotFound struct{}
_ error = (*ErrOutOfGas)(nil)
_ error = (*ErrActorNotFound)(nil)
_ error = (*errF3Disabled)(nil)
_ error = (*errF3ParticipationTicketInvalid)(nil)
_ error = (*errF3ParticipationTicketExpired)(nil)
_ error = (*errF3ParticipationIssuerMismatch)(nil)
_ error = (*errF3NotReady)(nil)
)

func (e *ErrActorNotFound) Error() string {
return "actor not found"
func init() {
RPCErrors.Register(EOutOfGas, new(*ErrOutOfGas))
RPCErrors.Register(EActorNotFound, new(*ErrActorNotFound))
RPCErrors.Register(EF3Disabled, new(*errF3Disabled))
RPCErrors.Register(EF3ParticipationTicketInvalid, new(*errF3ParticipationTicketInvalid))
RPCErrors.Register(EF3ParticipationTicketExpired, new(*errF3ParticipationTicketExpired))
RPCErrors.Register(EF3ParticipationIssuerMismatch, new(*errF3ParticipationIssuerMismatch))
RPCErrors.Register(EF3ParticipationTooManyInstances, new(*errF3ParticipationTooManyInstances))
RPCErrors.Register(EF3ParticipationTicketStartBeforeExisting, new(*errF3ParticipationTicketStartBeforeExisting))
RPCErrors.Register(EF3NotReady, new(*errF3NotReady))
}

var RPCErrors = jsonrpc.NewErrors()

func ErrorIsIn(err error, errorTypes []error) bool {
for _, etype := range errorTypes {
tmp := reflect.New(reflect.PointerTo(reflect.ValueOf(etype).Elem().Type())).Interface()
@@ -36,7 +71,42 @@ func ErrorIsIn(err error, errorTypes []error) bool {
return false
}

func init() {
RPCErrors.Register(EOutOfGas, new(*ErrOutOfGas))
RPCErrors.Register(EActorNotFound, new(*ErrActorNotFound))
// ErrOutOfGas signals that a call failed due to insufficient gas.
type ErrOutOfGas struct{}

func (ErrOutOfGas) Error() string { return "call ran out of gas" }

// ErrActorNotFound signals that the actor is not found.
type ErrActorNotFound struct{}

func (ErrActorNotFound) Error() string { return "actor not found" }

type errF3Disabled struct{}

func (errF3Disabled) Error() string { return "f3 is disabled" }

type errF3ParticipationTicketInvalid struct{}

func (errF3ParticipationTicketInvalid) Error() string { return "ticket is not valid" }

type errF3ParticipationTicketExpired struct{}

func (errF3ParticipationTicketExpired) Error() string { return "ticket has expired" }

type errF3ParticipationIssuerMismatch struct{}

func (errF3ParticipationIssuerMismatch) Error() string { return "issuer does not match current node" }

type errF3ParticipationTooManyInstances struct{}

func (errF3ParticipationTooManyInstances) Error() string { return "requested instance count too high" }

type errF3ParticipationTicketStartBeforeExisting struct{}

func (errF3ParticipationTicketStartBeforeExisting) Error() string {
return "ticket starts before existing lease"
}

type errF3NotReady struct{}

func (errF3NotReady) Error() string { return "f3 isn't yet ready to participate" }
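
The sentinel-error pattern used in this file (an unexported empty struct type with a value-receiver `Error` method, exposed through an exported variable) can be matched with `errors.Is` even after wrapping. The sketch below uses simplified, hypothetical names (`errNotReady`, `ErrNotReady`, `isNotReady`, `wrap`) and covers in-process matching only; it says nothing about how errors are reconstructed on the far side of the JSON-RPC boundary.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotReady mirrors the shape of errF3NotReady: an empty struct whose
// Error method has a value receiver.
type errNotReady struct{}

func (errNotReady) Error() string { return "f3 isn't yet ready to participate" }

// ErrNotReady is the exported sentinel value that callers compare against.
var ErrNotReady = errNotReady{}

// Because Error has a value receiver, both the value and the pointer type
// satisfy the error interface.
var (
	_ error = errNotReady{}
	_ error = (*errNotReady)(nil)
)

// errOther is an unrelated error used for contrast.
var errOther = errors.New("some other failure")

// isNotReady reports whether err is, or wraps, the sentinel value.
func isNotReady(err error) bool { return errors.Is(err, ErrNotReady) }

// wrap adds context while keeping the original error reachable via %w.
func wrap(err error) error { return fmt.Errorf("participating in F3: %w", err) }

func main() {
	fmt.Println(isNotReady(wrap(ErrNotReady)))
	fmt.Println(isNotReady(wrap(errOther)))
}
```

Empty struct values of the same type always compare equal, which is what lets `errors.Is` match the sentinel without a custom `Is` method.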
83 changes: 67 additions & 16 deletions api/api_full.go
Expand Up @@ -8,6 +8,7 @@ import (

blocks "github.com/ipfs/go-block-format"
"github.com/ipfs/go-cid"
"github.com/libp2p/go-libp2p/core/peer"

"github.com/filecoin-project/go-address"
"github.com/filecoin-project/go-bitfield"
@@ -910,24 +911,51 @@ type FullNode interface {

//*********************************** ALL F3 APIs below are not stable & subject to change ***********************************

// F3Participate should be called by a storage provider to participate in signing F3 consensus.
// Calling this API gives the lotus node a lease to sign in F3 on behalf of given SP.
// The lease should be active only on one node. The lease will expire at the newLeaseExpiration.
// To continue participating in F3 with the given node, call F3Participate again before
// the newLeaseExpiration time.
// newLeaseExpiration cannot be further than 5 minutes in the future.
// It is recommended to call F3Participate every 60 seconds
// with newLeaseExpiration set 2min into the future.
// The oldLeaseExpiration has to be set to newLeaseExpiration of the last successful call.
// For the first call to F3Participate, set the oldLeaseExpiration to zero value/time in the past.
// F3Participate will return true if the lease was accepted.
// The minerID has to be the ID address of the miner.
F3Participate(ctx context.Context, minerID address.Address, newLeaseExpiration time.Time, oldLeaseExpiration time.Time) (bool, error) //perm:sign
// F3GetCertificate returns a finality certificate at given instance number
// F3GetOrRenewParticipationTicket retrieves or renews a participation ticket
// necessary for a miner to engage in the F3 consensus process for the given
// number of instances.
//
// This function accepts an optional previous ticket. If provided, a new ticket
// will be issued only under one of the following conditions:
// 1. The previous ticket has expired.
// 2. The issuer of the previous ticket matches the node processing this
// request.
//
// If there is an issuer mismatch (ErrF3ParticipationIssuerMismatch), the miner
// must retry obtaining a new ticket to ensure it is only participating in one F3
// instance at any time. If the number of instances is beyond the maximum leasable
// participation instances accepted by the node ErrF3ParticipationTooManyInstances
// is returned.
//
// Note: Successfully acquiring a ticket alone does not constitute participation.
// The retrieved ticket must be used to invoke F3Participate to actively engage
// in the F3 consensus process.
F3GetOrRenewParticipationTicket(ctx context.Context, minerID address.Address, previous F3ParticipationTicket, instances uint64) (F3ParticipationTicket, error) //perm:sign
// F3Participate enrolls a storage provider in the F3 consensus process using a
// provided participation ticket. This ticket grants a temporary lease that enables
// the provider to sign transactions as part of the F3 consensus.
//
// The function verifies the ticket's validity and checks if the ticket's issuer
// aligns with the current node. If there is an issuer mismatch
// (ErrF3ParticipationIssuerMismatch), the provider should retry with the same
// ticket, assuming the issue is due to transient network problems or operational
// deployment conditions. If the ticket is invalid
// (ErrF3ParticipationTicketInvalid) or has expired
// (ErrF3ParticipationTicketExpired), the provider must obtain a new ticket by
// calling F3GetOrRenewParticipationTicket.
//
// The start instance associated with the given ticket cannot be less than the
// start instance of any existing lease held by the miner. Otherwise,
// ErrF3ParticipationTicketStartBeforeExisting is returned. In this case, the
// miner should acquire a new ticket before attempting to participate again.
//
// For details on obtaining or renewing a ticket, see F3GetOrRenewParticipationTicket.
F3Participate(ctx context.Context, ticket F3ParticipationTicket) (F3ParticipationLease, error) //perm:sign
// F3GetCertificate returns a finality certificate at given instance.
F3GetCertificate(ctx context.Context, instance uint64) (*certs.FinalityCertificate, error) //perm:read
// F3GetLatestCertificate returns the latest finality certificate
// F3GetLatestCertificate returns the latest finality certificate.
F3GetLatestCertificate(ctx context.Context) (*certs.FinalityCertificate, error) //perm:read
// F3GetGetManifest returns the current manifest being used for F3
// F3GetManifest returns the current manifest being used for F3 operations.
F3GetManifest(ctx context.Context) (*manifest.Manifest, error) //perm:read
// F3GetECPowerTable returns a F3 specific power table for use in standalone F3 nodes.
F3GetECPowerTable(ctx context.Context, tsk types.TipSetKey) (gpbft.PowerEntries, error) //perm:read
@@ -936,6 +964,29 @@ type FullNode interface {
// F3IsRunning returns true if the F3 instance is running, false if it's not running but
// it's enabled, and an error when disabled entirely.
F3IsRunning(ctx context.Context) (bool, error) //perm:read
// F3GetProgress returns the progress of the current F3 instance in terms of instance ID, round and phase.
F3GetProgress(ctx context.Context) (gpbft.Instant, error) //perm:read
}

// F3ParticipationTicket represents a ticket that authorizes a miner to
// participate in the F3 consensus.
type F3ParticipationTicket []byte

// F3ParticipationLease defines the lease granted to a storage provider for
// participating in F3 consensus, detailing the session identifier, issuer,
// subject, and the expiration instance.
type F3ParticipationLease struct {
// Network is the name of the network this lease belongs to.
Network gpbft.NetworkName
// Issuer is the identity of the node that issued the lease.
Issuer peer.ID
// MinerID is the actor ID of the miner that holds the lease.
MinerID uint64
// FromInstance specifies the instance ID from which this lease is valid.
FromInstance uint64
// ValidityTerm specifies the number of instances for which the lease remains
// valid from the FromInstance.
ValidityTerm uint64
}
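
To illustrate how the lease fields might be used together, here is a small sketch that checks whether a lease is still usable. It uses plain `string` and `uint64` fields instead of `gpbft.NetworkName` and `peer.ID`, and it assumes a lease counts as expired once the current instance exceeds `FromInstance + ValidityTerm`; that boundary is an assumption for illustration, not something this diff specifies. The helper names `expiredAt` and `usable` are hypothetical.

```go
package main

import "fmt"

// lease is a simplified stand-in for F3ParticipationLease using plain types.
type lease struct {
	Network      string
	Issuer       string
	MinerID      uint64
	FromInstance uint64
	ValidityTerm uint64
}

// expiredAt reports whether the lease has expired at the given instance.
// Assumption: expiry happens once current exceeds FromInstance+ValidityTerm.
func (l lease) expiredAt(current uint64) bool {
	return current > l.FromInstance+l.ValidityTerm
}

// usable reports whether the lease belongs to the given network and has not
// expired. Checking the network name prevents reuse across networks.
func (l lease) usable(network string, current uint64) bool {
	return l.Network == network && !l.expiredAt(current)
}

func main() {
	l := lease{Network: "mainnet", MinerID: 1234, FromInstance: 100, ValidityTerm: 5}
	fmt.Println(l.usable("mainnet", 103))
	fmt.Println(l.usable("mainnet", 106))
	fmt.Println(l.usable("testnet", 103))
}
```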

// EthSubscriber is the reverse interface to the client, called after EthSubscribe